This document discusses key concepts in sampling and statistical inference. It defines parameters and statistics, and explains sampling distributions including the sampling distribution of the mean, proportion, and difference between means. The central limit theorem is covered, stating that as sample size increases, the sampling distribution of the mean approaches a normal distribution. Common distributions used in statistical inference like the t, F, and chi-square distributions are also summarized.
3. • Universe/population (finite or infinite)
• Sampling units
• Sampling frame
• Sampling design
• Statistic/parameter: measures such as the mean, median, or mode computed from a sample are called statistics, because they describe the characteristics of that sample; when the same measures describe the characteristics of a population, they are known as parameters.
• Sampling error
• Precision
• Confidence interval
4. Parameter
• A population characteristic, e.g. μ, σ, P, the median, percentiles, etc.
Sample statistic
• Any quantity computed from values in a sample, e.g. x̄, s, the sample proportion, etc.
5. • The concept of a sampling distribution is perhaps the most basic concept in inferential statistics.
• The sampling distribution of the mean
• The sampling distribution of the difference between means
• The sampling distribution of r
• The sampling distribution of a proportion
• Central limit theorem
The central limit theorem states that:
• Given a population with a finite mean μ and a finite nonzero variance σ², the sampling distribution of the mean approaches a normal distribution with mean μ and variance σ²/N as N, the sample size, increases.
6. • Example of the central limit theorem in practice: roll 30 dice and calculate the average (sample mean) of the numbers you get on the dice. Now repeat this experiment 1000 times, each time rolling 30 dice and computing a new sample mean. Plot a histogram of the 1000 sample means you have obtained; this plot will look approximately normal.
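The dice experiment above can be sketched in a few lines of Python using only the standard library; the seed and the loop counts are illustrative assumptions matching the slide (30 dice, 1000 repetitions).

```python
import random
import statistics

random.seed(0)

def dice_sample_mean(n_dice=30):
    """Roll n_dice fair dice and return the average face value."""
    return statistics.mean(random.randint(1, 6) for _ in range(n_dice))

# Repeat the experiment 1000 times to build the sampling distribution.
sample_means = [dice_sample_mean() for _ in range(1000)]

# A single die has mean 3.5 and variance 35/12, so the CLT predicts the
# sample means cluster around 3.5 with variance (35/12)/30 ≈ 0.097.
print(statistics.mean(sample_means))      # close to 3.5
print(statistics.variance(sample_means))  # close to 0.097
```

Plotting a histogram of `sample_means` (e.g. with matplotlib) gives the approximately normal shape the slide describes.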
8. The sampling distribution of the mean refers to the probability distribution of all the possible means of random samples of a given size taken from a population. If the samples are taken from a normal population, the sampling distribution of the mean is itself normal for every sample size.
9. • Usually the statistics of attributes correspond to the conditions of a binomial distribution, which tends toward the normal distribution as n becomes larger and larger. If p represents the proportion of defectives (i.e., of successes) and q the proportion of non-defectives (i.e., of failures), so that q = 1 − p, and the sample proportion is treated as a random variable, then the sampling distribution of the proportion of successes has mean = p and standard deviation = √(p·q/n), where n is the sample size.
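A quick simulation can confirm the √(p·q/n) formula; this is a minimal sketch in which the true proportion p = 0.2, the sample size n = 100, and the number of repetitions are all illustrative assumptions.

```python
import math
import random

random.seed(1)

p, n = 0.2, 100  # assumed true proportion of defectives and sample size

# Theoretical standard deviation of the sample proportion: sqrt(p*q/n).
theoretical_sd = math.sqrt(p * (1 - p) / n)

# Simulate many samples and measure the empirical spread of the proportion.
props = []
for _ in range(2000):
    defectives = sum(random.random() < p for _ in range(n))
    props.append(defectives / n)

mean_prop = sum(props) / len(props)
emp_sd = math.sqrt(sum((x - mean_prop) ** 2 for x in props) / (len(props) - 1))

print(round(theoretical_sd, 4))  # 0.04
print(round(emp_sd, 4))          # close to 0.04
```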
10. • The variable t differs from z in that we use the sample standard deviation in the calculation of t, whereas we use the population standard deviation in the calculation of z. There is a different t distribution for every possible sample size, i.e., for different degrees of freedom; the degrees of freedom for a sample of size n is n − 1. As the sample size gets larger, the shape of the t distribution becomes approximately equal to that of the normal distribution.
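The difference between z and t described above comes down to which standard deviation appears in the denominator; the sketch below makes that concrete for a hypothetical sample, where μ = 50, σ = 10, and n = 16 are assumed values.

```python
import math
import random
import statistics

random.seed(2)

mu, sigma, n = 50.0, 10.0, 16  # assumed population mean/sd and sample size
sample = [random.gauss(mu, sigma) for _ in range(n)]

x_bar = statistics.mean(sample)
s = statistics.stdev(sample)  # sample standard deviation, df = n - 1

# z uses the known population sigma; t replaces it with the sample s.
z = (x_bar - mu) / (sigma / math.sqrt(n))
t = (x_bar - mu) / (s / math.sqrt(n))

print(f"degrees of freedom = {n - 1}")
print(round(z, 3), round(t, 3))
```

The two statistics differ only through s versus σ; for large n, s estimates σ well and the two values (and the two distributions) converge.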
11. • The F distribution assumes two independent normal populations having the same variance.
• The calculated value of F from the sample data is compared with the corresponding table value of F; if the former is equal to or exceeds the latter, we infer that the null hypothesis of equal variances cannot be accepted. We shall make use of the F ratio in the context of hypothesis testing and also in the context of the ANOVA technique.
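Computing the F ratio itself is straightforward; this sketch assumes two hypothetical samples drawn from normal populations with the same variance (the null hypothesis), and leaves the comparison against a table value to an F table or a statistics library.

```python
import random
import statistics

random.seed(3)

# Two independent samples, assumed drawn from normal populations
# with the same variance (the null hypothesis).
a = [random.gauss(0, 5) for _ in range(12)]
b = [random.gauss(0, 5) for _ in range(10)]

# F ratio: conventionally the larger sample variance over the smaller,
# with degrees of freedom (n1 - 1, n2 - 1).
var_a, var_b = statistics.variance(a), statistics.variance(b)
F = max(var_a, var_b) / min(var_a, var_b)

print(round(F, 3))  # compare against the F table value at the chosen alpha
```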
12. • The chi-square distribution is not symmetrical, and all its values are positive, with (n − 1) degrees of freedom.
• The chi-square distribution is encountered when we deal with collections of values that involve adding up squares.
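The "adding up squares" remark can be made concrete: a chi-square variate with k degrees of freedom is the sum of k squared standard normal variates. This is a minimal simulation of that fact; k = 5 and the draw count are illustrative assumptions.

```python
import random
import statistics

random.seed(4)
k = 5  # assumed degrees of freedom

# A chi-square variate with k df is a sum of k squared standard normals,
# so every value is positive and the distribution is right-skewed.
draws = [sum(random.gauss(0, 1) ** 2 for _ in range(k)) for _ in range(5000)]

print(round(statistics.mean(draws), 2))  # E[chi-square with k df] = k
print(min(draws) > 0)                    # all values positive
```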
13. • A point estimator is a formula that uses sample data to calculate a single number (a sample statistic) that can be used as an estimate of a population parameter, e.g. x̄ and s as estimates of μ and σ.
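As a concrete illustration of point estimation, the sketch below computes x̄ and s from a hypothetical sample; the data values are invented for the example.

```python
import statistics

# A hypothetical sample; x_bar and s serve as point estimates
# of the population parameters mu and sigma.
sample = [12.1, 9.8, 11.4, 10.6, 12.9, 10.2]

x_bar = statistics.mean(sample)  # point estimate of the population mean mu
s = statistics.stdev(sample)     # point estimate of the population sd sigma

print(round(x_bar, 2))
print(round(s, 2))
```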