2. 2
Contents
1. Population & Sample
2. Descriptive Statistics
   1. Central tendency
   2. Spread (or Dispersion)
3. Normal Distribution
4. Central Limit Theorem
5. Correlation, covariance
6. Probability
7. Statistical tests & Significance level
8. Random Variables
9. Other Distributions
3. 3
Population & Sample
Population: the complete set of items that share a characteristic of the
subject under analysis
Sample: a part of the population under study, selected
so that inferences about the population can be drawn from it
Primary data: data collected specifically for the analysis of the subject
Secondary data: data that was already collected and is reused to analyze
other factors of the subject
Measurement scales:
Nominal, ordinal, interval, ratio
5. 5
Descriptive statistics
Central Tendency: Median
Median: the middle term of an ordered set x_1 ≤ x_2 ≤ .. ≤ x_N
Median = x_{(N+1)/2}                 : if ‘N’ is an odd number
Median = (x_{N/2} + x_{N/2+1}) / 2   : if ‘N’ is an even number
Ex: S = {1,4,7,7,9,11,23}
N = 7 is odd, so median of S = 4th term of the set
= 7
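The two cases of the median rule can be sketched in a few lines of Python. The function name `median_of` is only an illustrative choice, and the even-length list in the second call is an invented example, not taken from the slides.

```python
def median_of(values):
    """Return the median using the odd/even rule for an ordered set."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd N: the single middle term (1-based position (N+1)/2).
        return ordered[mid]
    # Even N: average of the two middle terms (1-based positions N/2 and N/2+1).
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_of([1, 4, 7, 7, 9, 11, 23]))  # 7
print(median_of([1, 3, 5, 7]))             # 4.0
```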
8. 8
Descriptive statistics
For the given sample S = {1,2,5,7,9,11,12,15}
Min: 1
Max: 15
Range = Max – Min
= 15 – 1 = 14
Q1: middle term in the 1st half of the ordered set
Q1 = (2+5)/2 = 3.5
Q2: Median
Q3: middle term in the 2nd half of the ordered set
Q3 = (11+12)/2 = 11.5
IQR: Inter Quartile Range
= Q3 – Q1
= 11.5 – 3.5
= 8
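The range and IQR above can be reproduced with a short sketch. It uses the "median of each half" rule from this slide. For an odd number of points it leaves the overall median out of both halves, which is an assumption here, since conventions differ. The helper name `quartiles` is illustrative.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves rule."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # Assumption: for odd n the middle term is excluded from both halves.
    upper = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    return median(lower), median(ordered), median(upper)


S = [1, 2, 5, 7, 9, 11, 12, 15]
q1, q2, q3 = quartiles(S)
data_range = max(S) - min(S)
iqr = q3 - q1
print(data_range, q1, q3, iqr)  # 14 3.5 11.5 8.0
```

Library routines such as percentile functions often interpolate between points and may return slightly different quartiles for the same data.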
9. 9
Normal Distribution
Normal (or) Gaussian Distribution: N(µ, σ)
- Is a bell curve with the mean µ at the center
- about 68% of the data lies within 1 standard deviation (σ) of the mean
- about 95.4% of the data lies within 2 standard deviations of the mean
- about 99.7% of the data lies within 3 standard deviations of the mean
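The three percentages can be checked numerically. For a normal distribution, the fraction of data within k standard deviations of the mean equals erf(k / √2). The sketch below uses only the standard library, and the function name `fraction_within` is illustrative.

```python
import math


def fraction_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {fraction_within(k):.1%}")
# within 1 standard deviation(s): 68.3%
# within 2 standard deviation(s): 95.4%
# within 3 standard deviation(s): 99.7%
```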
10. 10
Normal Distribution
In a perfect Normal Distribution N(µ, σ),
1. the mean, median and mode coincide
2. the data is symmetrically distributed around this common value
11. 11
Standardized Normal Distribution
Standardized Normal Distribution: N(0, 1)
Is a normal distribution with mean = 0 and standard deviation = 1
To standardize the distribution,
Step 1. center the data (subtract the mean from every data point)
Step 2. divide each centered value by the standard deviation
z = (x – µ) / σ
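The two standardization steps can be sketched as follows. The data values are invented for illustration. The population standard deviation (`pstdev`) is used, which is an assumption, because the slides do not say whether the population or sample formula is meant.

```python
from statistics import mean, pstdev


def standardize(values):
    """Center on the mean, then scale by the (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("cannot standardize data with zero spread")
    return [(x - mu) / sigma for x in values]


data = [2, 4, 4, 4, 5, 5, 7, 9]   # illustrative data: mean 5, standard deviation 2
z = standardize(data)
print(z)                           # [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
print(mean(z), pstdev(z))          # 0.0 1.0
```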
14. 14
Correlation
Correlation: Cor(X,Y) ∈ [-1, 1]
- Correlation indicates the strength of the linear association
between two variables
- it ranges from -1 to +1
cor(x,y) = 1 // perfectly positively correlated
cor(x,y) = -1 // perfectly negatively correlated
cor(x,y) = 0 // uncorrelated (no linear association); independent variables have cor = 0, but cor = 0 alone does not prove independence
Cor(X,Y) = Cov(X,Y) / (σ_X * σ_Y)
16. 16
Covariance
Covariance: is a measure of how two variables change
together.
Cov(X,Y) = E[(X – µ_X)(Y – µ_Y)]
- So correlation is the covariance normalized by the standard
deviations of the 2 variables
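A minimal sketch of both quantities, assuming the population (divide by N) form of covariance and standard deviation. The function names and sample values are illustrative only.

```python
from statistics import mean, pstdev


def covariance(xs, ys):
    """Population covariance: the average product of deviations from the means."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def correlation(xs, ys):
    """Covariance normalized by the standard deviations of both variables."""
    return covariance(xs, ys) / (pstdev(xs) * pstdev(ys))


xs = [1, 2, 3, 4, 5]
print(round(correlation(xs, [2, 4, 6, 8, 10]), 6))   # 1.0  (perfect positive)
print(round(correlation(xs, [10, 8, 6, 4, 2]), 6))   # -1.0 (perfect negative)
```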
17. 17
Probability
Terminology:
Sample space: the set of all possible outcomes of an experiment
Event: a subset of the sample space
Probability: a measure of how likely an event is to
occur among all the possible outcomes.
For equally likely outcomes,
Probability of ‘event x’ = p(x) = (number of outcomes in which x occurs) / (number of all possible outcomes)
Example:
Experiment: tossing a fair coin
Sample Space = {HEAD,TAIL}
Probability of ‘Head’ = P(HEAD) = ½ = 0.5
Probability of ‘Tail’ = P(TAIL) = ½ = 0.5
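For equally likely outcomes, the probability of an event is a simple count ratio. The sketch below models the sample space and the event as Python sets. The helper name `probability` and the die example are illustrative additions.

```python
from fractions import Fraction


def probability(event, sample_space):
    """P(event) for equally likely outcomes: favorable outcomes / all outcomes."""
    favorable = set(event) & set(sample_space)
    return Fraction(len(favorable), len(sample_space))


coin = {"HEAD", "TAIL"}
print(probability({"HEAD"}, coin))      # 1/2

die = {1, 2, 3, 4, 5, 6}
print(probability({2, 4, 6}, die))      # 1/2  (rolling an even number)
```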
18. 18
Probability cont..
Axioms of probability
1. p(x) ∈ [0, 1]
2. The complement p^c(x) of p(x) satisfies p^c(x) = 1 – p(x)
3. The sum of the probabilities of all mutually exclusive outcomes x1, x2, .. in a sample space
is equal to 1: p(x1) + p(x2) + .. = 1
4. P(A ∪ B) = P(A) + P(B) – P(A ∩ B)
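These rules can be checked on a small example. The sketch assumes a fair six-sided die with equally likely outcomes. The events A and B are invented for illustration.

```python
from fractions import Fraction

die = set(range(1, 7))                 # sample space of a fair die


def p(event):
    """Probability of an event (a subset of the die outcomes)."""
    return Fraction(len(event & die), len(die))


A = {2, 4, 6}                          # even number
B = {4, 5, 6}                          # greater than 3

print(p(die - A) == 1 - p(A))                  # True: complement rule
print(sum(p({x}) for x in die) == 1)           # True: outcomes sum to 1
print(p(A | B) == p(A) + p(B) - p(A & B))      # True: addition rule
print(p(A | B))                                # 2/3
```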
19. 19
Conditional Probability
Conditional Probability: is the probability of
event A given that event B has occurred
P(A | B) = P(A ∩ B) / P(B)
Where,
P(A | B) : Probability of A conditioned on B
P(A ∩ B) : Joint probability of A and B
P(B) : Probability of B
For independent events A, B
P(A ∩ B) = P(A) * P(B)
∴ P(A | B) = P(A) * P(B) / P(B) = P(A)
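Both the definition and the independence result can be checked on a fair die. The events below are illustrative choices. A and C happen to be independent, while A and B are not.

```python
from fractions import Fraction

die = set(range(1, 7))


def p(event):
    """Probability of an event on a fair die."""
    return Fraction(len(event & die), len(die))


def conditional(a, b):
    """P(A | B) = P(A ∩ B) / P(B); undefined when P(B) = 0."""
    if p(b) == 0:
        raise ValueError("P(B) must be greater than 0")
    return p(a & b) / p(b)


A = {2, 4, 6}        # even number
B = {4, 5, 6}        # greater than 3
C = {1, 2, 3, 4}     # at most 4

print(conditional(A, B))               # 2/3, differs from P(A) = 1/2
print(p(A & C) == p(A) * p(C))         # True: A and C are independent
print(conditional(A, C) == p(A))       # True: so P(A | C) = P(A)
```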
21. 21
Naïve Bayesian
Naïve Bayesian: Bayes Rule + Conditional independence
If ‘E’ is the event conditioned on x1,x2, .. xn, then
by applying the chain rule & independence condition we
get the final equation
P(E | x1, x2, .. xn) = P(E) * P(x1 | E) * P(x2 | E) * .. * P(xn | E) / P(x1, x2, .. xn)
The denominator does not depend on E, so
P(E | x1, x2, .. xn) ∝ P(E) * P(x1 | E) * P(x2 | E) * .. * P(xn | E)
P(E | x1, x2, .. xn) ∝ P(E) * ∏_{i=1}^{n} P(xi | E)
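A minimal sketch of the final equation: multiply the prior by each per-feature likelihood, then normalize so the scores sum to 1. Every number, event name and feature name below is hypothetical and chosen only to make the arithmetic easy to follow.

```python
from math import prod


def naive_bayes_posterior(priors, likelihoods, observed):
    """Posterior over events: proportional to P(E) * product of P(x_i | E)."""
    scores = {
        event: prior * prod(likelihoods[event][x] for x in observed)
        for event, prior in priors.items()
    }
    total = sum(scores.values())       # plays the role of P(x1, .., xn)
    if total == 0:
        raise ValueError("observed features have zero probability under every event")
    return {event: score / total for event, score in scores.items()}


# Hypothetical priors P(E) and likelihoods P(x | E).
priors = {"spam": 0.4, "ham": 0.6}
likelihoods = {
    "spam": {"offer": 0.5, "meeting": 0.1},
    "ham": {"offer": 0.1, "meeting": 0.4},
}

posterior = naive_bayes_posterior(priors, likelihoods, ["offer"])
print(posterior)   # spam ≈ 0.769, ham ≈ 0.231
```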
23. 23
Random Variable
Random Variables & Probability Distribution:
Discrete:
Probability Mass Function (PMF)
Cumulative Distribution Function (CDF)
Continuous:
Probability Density Function (PDF)
Cumulative Distribution Function (CDF)
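For the discrete case, the link between the two functions can be shown with a fair die, used here as an illustrative random variable. The mass function gives P(X = x), and the cumulative function gives P(X ≤ x) as a running sum.

```python
from fractions import Fraction
from itertools import accumulate

outcomes = [1, 2, 3, 4, 5, 6]

# Probability mass function: P(X = x) for each outcome of a fair die.
pmf = {x: Fraction(1, 6) for x in outcomes}

# Cumulative function: P(X <= x), the running sum of the PMF.
cdf = dict(zip(outcomes, accumulate(pmf[x] for x in outcomes)))

print(pmf[3])   # 1/6
print(cdf[3])   # 1/2
print(cdf[6])   # 1
```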