Research Methods in Health
Chapter 6. Sampling Design
Young Moon Chae, Ph.D.
Graduate School of Public Health
Yonsei University, Korea
ymchae@yuhs.ac
Table of Contents
• Introduction
• Population and sampling
• Probability sampling and non-probability sampling
• Types of probability sampling
• Determination of sample size
Introduction to Sampling Design
• Selection of some part of an aggregate or totality on the basis of which a
judgment or inference (statistical) about the aggregate or totality is made
• Items so selected constitute a sample
• The selection process or technique is sample design
• Survey conducted on the basis of sample is sample survey
• A complete enumeration of all the items in the ‘Population’ is called a
census inquiry
• Census inquiry requires enormous time, money & energy; the slightest bias
may get magnified as the number of observations increases
• It is possible to obtain sufficiently accurate results by studying only a part (a
miniature cross-section) of the population
Basic Concepts
• UNIVERSE: Totality of items or units in any field of enquiry
• POPULATION
- Total of items about which information is desired: an aggregate of elementary units (finite or infinite, of size N)
that possess at least one common characteristic, real or hypothetical
• ELEMENTARY UNITS
- Units possessing the relevant characteristics i.e., attributes that are the object of study (operational
definition)
• SAMPLING FRAME:
- A list of all the units of population
- Perfect frames seldom exist; elementary units, clusters of such units, or groups form the basis of the
frame for a finite population
- Frame is either constructed by the researcher or some existing list of population is used
- Sampling frame should represent the population well and be as free as possible from (i)
incompleteness, (ii) inaccuracy, (iii) inadequacy, and (iv) being out of date (i.e., complete, no missing element, no
ineligibles, no element appearing more than once, and up to date)
• SAMPLING DESIGN: (Target population →Survey population)
- A definite plan for obtaining a sample from the sampling frame
- Refers to technique or procedure adopted by the researcher
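The frame-quality criteria listed under SAMPLING FRAME (no missing elements, no ineligibles, no duplicates) can be checked mechanically when a reference list of eligible units is available. The sketch below is only an illustration: the function name `check_frame` and the ID values are hypothetical, and in practice a complete reference list is often not available.

```python
from collections import Counter

def check_frame(frame, eligible):
    """Compare a sampling frame against a reference set of eligible unit IDs."""
    counts = Counter(frame)
    return {
        # units listed more than once in the frame
        "duplicates": sorted(u for u, c in counts.items() if c > 1),
        # units in the frame that do not belong to the population
        "ineligible": sorted(set(frame) - set(eligible)),
        # population units the frame fails to cover
        "missing": sorted(set(eligible) - set(frame)),
    }

# Hypothetical data: five eligible units and a flawed frame
eligible = {101, 102, 103, 104, 105}
frame = [101, 102, 102, 104, 999]
report = check_frame(frame, eligible)
print(report)
```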
Characteristics of a Good Sampling Design
• Truly representative (Representativeness is not the property of the sample but
of the procedure by which the sample is obtained)
• Having small sampling error
• Economically viable
• Systematic bias is controlled (in a better way)
• Results can be applied to the universe in general with a reasonable level of
confidence ⇒reliability
• Optimum size (adequately large)
• Similar to population / universe
• Should have all the characteristics that are present in the population / universe
Population and Sampling
• Sampling is the process of selection of a number of units
from a defined study population.
• The process of sampling involves:
1. Identification of study population
2. Determination of sampling population
3. Definition of the sampling unit
4. Choice of sampling method
5. Estimation of the sample size
Identification of study population
• The study or target population is the one to which the results of the study
will be generalized.
• It is crucial that the study population is clearly defined, since it is the most
important determinant of the sampling population
Determination of sampling population
• The sampling population is the one from which the sample is drawn.
• The definition of the sampling population by the investigator is governed by
two factors:
- Feasibility: reachable sampling population
- External validity: the ability to generalize from the study results to the target
population.
Aims and Advantages of Sampling
• Examining a sample saves money. Obviously the assessment of a
sample of 100 from a population of 1000 will reduce costs considerably not
only because the sample consists of fewer people, but also because
sampling saves labour - only a smaller number of staff are needed for both
the fieldwork and the processing of data.
• Sampling also saves time. An estimate of the characteristics of the
population can be gathered in a short time. As a result of reduced costs and
the time saved through sampling, more attention can be given to each
subject, hence increasing the accuracy of the analysis.
• Sampling allows us to deduce characteristics of a population while
observing only some of its members (the sample). In order to achieve this
aim it is necessary to avoid selection bias.
• Sampling allows for more accurate measurements
− Inspection fatigue is reduced (non-sampling error)
− Sampling error can be studied and controlled, and probability statements can be made about its
magnitude
Bias in the Sampling
• Bias in the selection of a sample can arise if:
- The sampling is done by a non-random method, which generally means the
selection is influenced by human choice.
- The sampling frame (census, phone book) which serves as the basis for
selection does not cover the population adequately, completely or
accurately.
- Groups of the population are not represented because you can't find them
or they refuse to participate.
Choice of sampling method
• Non-probability sampling
- In non-probability sampling, there is no assurance that each individual has a
known chance of being selected
• Probability sampling
- Probability sampling means that each individual in the population has a known
probability or chance of selection into the sample
Non-probability Sampling:
• Types of non-probability sampling:
- Convenience sampling
- Quota sampling
✓ Selection based on some basic parameters like age, sex, income, etc. of population so as
to make it representative
✓ Field workers are assigned quotas of number of units satisfying the required
characteristics for collecting data
✓ Looks similar to stratified sampling, but differs in that selection within each cell is left to the
discretion of the field worker, whereas stratified sampling draws a random sample from each cell
✓ Difficult to obtain an accurate & up-to-date proportion of respondents assigned to each
cell
• Not recommended in medical research
- It is by far the most biased sampling procedure as it is not random (not everyone
in the population has an equal chance of being selected to participate in the study).
Probability Sampling
• Assigns a known, non-zero probability of selection to each unit of the population; in the simplest
case every element has an equal chance of being selected (equi-probability)
• Random does not mean haphazard
• Errors of estimation or significance of results obtained can be measured
• Best technique for representative sample
• Ensures the law of statistical regularity (i.e., on average, the sample chosen
will have the same composition and structure as the universe / population)
• In simple random sampling, each possible combination of sample units has an equal probability of
being picked
• All choices are independent of one another
Types of Probability Sampling
• Simple random sampling
• Systematic random sampling
• Stratified random sampling
• Multi-stage random sampling
• Cluster sampling
• Multi-phase sampling
Simple Random Sampling
• In this method, all subjects or elements have an equal probability of being
selected.
• SRS is the basic selection process & all other procedures are viewed as
modifications of SRS
• There are two major ways of drawing a random sample: the first is to consult
a random number table, and the second is to have the computer select a random sample.
e.g., Throwing a fair die. The probability of getting a one is the same for each of 20
sample throws, and the 20 throws are all independent
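As an illustration of having the computer select a random sample, the following minimal sketch uses Python's standard `random` module. The frame of patient IDs 1 to 1000, the sample size of 100, and the function name `simple_random_sample` are assumptions made for the example only.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct units from the frame; every subset of size n is equally likely."""
    rng = random.Random(seed)      # seeded generator so the draw can be reproduced
    return rng.sample(frame, n)    # sampling without replacement

# Hypothetical sampling frame: patient IDs 1..1000
frame = list(range(1, 1001))
sample = simple_random_sample(frame, 100, seed=42)
print(len(sample), "units drawn,", len(set(sample)), "distinct")
```

Fixing the seed is a design choice that makes the selection auditable; omit it to get a fresh draw each time.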
(cont.)
Advantage
• Increases external validity of the results
• Allows for a smaller sample size
• Eliminates the influence of confounding factors
Systematic Random Sampling
• A systematic sample is drawn by randomly selecting a first case on a
list of the population and then taking every kth case after it until the sample
is complete. This is particularly useful if your list of the population is long.
• For example, if your list was the phone book, it would be easiest to start at
a randomly chosen entry, perhaps the 17th person, and then select every 50th person from that point
on.
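The procedure above (random start, then a fixed interval) can be sketched as follows. This is a minimal illustration, assuming the interval is taken as the frame length divided by the sample size (integer division); the function name and the 1000-entry frame are hypothetical.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Pick a random start within the first interval, then take every k-th unit."""
    k = len(frame) // n                 # sampling interval
    if k < 1:
        raise ValueError("sample size exceeds frame size")
    rng = random.Random(seed)
    start = rng.randrange(k)            # random start in 0 .. k-1
    return [frame[start + i * k] for i in range(n)]

# Hypothetical list of 1000 entries; a sample of 20 gives an interval of 50
frame = list(range(1000))
chosen = systematic_sample(frame, 20, seed=1)
print(chosen[:3], "...", chosen[-1])
```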
(cont.)
• Advantage of systematic sampling
− Easier to use & less costly for a large population
− Sample is spread more evenly over the entire population
− Elements can be ordered in a manner found in the universe
− Can be used even without a list of units in the population
• Disadvantage
− If there is hidden periodicity or an intrinsic cyclical relationship between
elements of the population, it is dangerous
e.g., 10 sample days selected from 70 consecutive days (every 7th day) to count the
number of issues in a library would all fall on the same day of the week
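The periodicity problem in the library example can be made concrete with a few lines of code. The sketch assumes day 0 is a Monday and that the random start happens to be 2; both are illustrative choices, and any start gives the same pattern.

```python
# Hypothetical setup: 70 consecutive days, day 0 assumed to be a Monday.
WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

def systematic_days(n_days, n_sample, start):
    """Return the day indices picked by systematic sampling with interval n_days // n_sample."""
    k = n_days // n_sample              # 70 // 10 = 7, the length of a week
    return list(range(start, n_days, k))[:n_sample]

chosen = systematic_days(70, 10, start=2)
weekdays_hit = {WEEKDAYS[d % 7] for d in chosen}
print(chosen)        # ten days, each exactly one week apart
print(weekdays_hit)  # a single weekday: the sample never sees the other six
```

Because the interval equals the length of the weekly cycle, the sample cannot reflect day-to-day variation in library issues.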
Stratified Sampling
• In a stratified sample, we sample either proportionately or equally to
represent various strata or subpopulations.
• In the proportional allocation, the same sampling ratio may be used in all
strata. That is, the number of individuals chosen in each stratum is
proportional to the size of the stratum
• If a uniform sampling fraction is used, the sample is ‘self-weighting’ and
can for some purposes be treated as if it were a simple random or
systematic sample
• For example, if our strata were cities in a country, we would make sure to
sample from each of the cities. If our strata were gender, we would sample
both men and women.
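A minimal sketch of proportional allocation with a uniform sampling fraction is shown below. The population of 600 women and 400 men, the 10% fraction, and the function name are assumptions for illustration only.

```python
import random
from collections import defaultdict

def proportional_stratified_sample(units, stratum_of, fraction, seed=None):
    """Apply the same sampling fraction within every stratum (self-weighting sample)."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for u in units:
        strata[stratum_of(u)].append(u)          # group units by stratum
    sample = {}
    for name, members in strata.items():
        n_h = round(fraction * len(members))     # stratum sample size
        sample[name] = rng.sample(members, n_h)  # simple random sample within stratum
    return sample

# Hypothetical population: (id, sex) pairs, 600 women and 400 men
population = [(i, "F") for i in range(600)] + [(i, "M") for i in range(600, 1000)]
sample = proportional_stratified_sample(population, lambda u: u[1], 0.10, seed=7)
print({name: len(members) for name, members in sample.items()})
```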
(cont.)
Steps for stratified sampling
• Divide the population into several sub-populations or strata that are more
homogeneous than the population as a whole, based on factors or characteristics closely
related to the purpose of the study, past experience, personal judgment, and
consideration of the relationship between the characteristics of the population and the
characteristics to be estimated. Conduct a pilot study if necessary and examine the
variances within and among strata.
• Draw a simple random sample from each stratum
- If stratification is implicit in the ordering of the list, systematic random sampling will serve the
purpose
- e.g., students of a school listed by the standard (grade) in which they are studying
• Decide on the size of the sample for each stratum based on: (i) size, (ii) variability &
(iii) sampling cost of the stratum
- A uniform sampling fraction: the sample size in each stratum is proportional to the size of the
stratum
- A variable sampling fraction: may be appropriate when strata differ in variability or sampling cost,
or when there is special interest in some subgroups or strata (weighting adjustments may be necessary while analyzing data)
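The three allocation factors (size, variability, cost) can be combined in one small function. The slides do not give a formula; the sketch below assumes the standard textbook allocation rules, in which stratum sample sizes are proportional to N_h (proportional allocation), to N_h × S_h (Neyman allocation), or to N_h × S_h / √c_h (optimum allocation with unequal costs). All stratum sizes and standard deviations are hypothetical.

```python
import math

def allocate(n, sizes, sds=None, costs=None):
    """Split a total sample n across strata.

    Weights: N_h (proportional), N_h*S_h (Neyman) if sds is given,
    N_h*S_h/sqrt(c_h) if per-unit costs are also given.
    Returns unrounded sizes; rounding is left to the user.
    """
    weights = []
    for h, N_h in enumerate(sizes):
        w = N_h
        if sds is not None:
            w *= sds[h]                  # more variable strata get more sample
        if costs is not None:
            w /= math.sqrt(costs[h])     # more expensive strata get less sample
        weights.append(w)
    total = sum(weights)
    return [n * w / total for w in weights]

sizes = [500, 300, 200]                  # hypothetical stratum sizes
proportional = allocate(100, sizes)
neyman = allocate(100, sizes, sds=[10, 20, 30])
print(proportional)
print([round(x, 1) for x in neyman])
```

Proportional allocation corresponds to the uniform sampling fraction; the other two produce variable sampling fractions and therefore require weighting at the analysis stage.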
(cont.)
Advantage
• There is less sampling variation than with simple random or systematic
sampling
• If the strata are more uniform than the total population with respect to other
attributes, it reduces sampling variation with respect to other properties also
• The greater the differences between the strata and the smaller the differences
within the strata, the greater is the gain due to stratification
• It is possible to compare the differences in mean between the strata by
using ANOVA
Multi-stage Sampling
• This method is used in large-scale surveys. A sample of first-stage
sampling units is chosen, each of the selected units is divided into second-stage
units, samples of second-stage units are selected, and so on
• Different methods (simple random, stratified, systematic or cluster sampling)
may be used at any stage
• The first-stage units may be provinces, villages, or other aggregations. This
method has the same advantages as cluster sampling: less travel by
interviewers, no need for a sampling frame showing all individuals in the
population, etc.
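A two-stage version of this design can be sketched as follows, assuming simple random sampling at both stages. The 20 villages of 50 households each, the stage sample sizes, and the function name are hypothetical.

```python
import random

def two_stage_sample(units_by_psu, n_psu, n_per_psu, seed=None):
    """Stage 1: sample first-stage units. Stage 2: sample units within each selected one."""
    rng = random.Random(seed)
    chosen_psus = rng.sample(sorted(units_by_psu), n_psu)      # first-stage selection
    sample = {}
    for psu in chosen_psus:
        members = units_by_psu[psu]
        take = min(n_per_psu, len(members))                    # guard for small units
        sample[psu] = rng.sample(members, take)                # second-stage selection
    return sample

# Hypothetical frame: 20 villages, each with 50 households
villages = {f"village_{v:02d}": [f"v{v:02d}_hh{h:02d}" for h in range(50)]
            for v in range(20)}
sample = two_stage_sample(villages, n_psu=5, n_per_psu=10, seed=3)
print({village: len(households) for village, households in sample.items()})
```

Only the selected villages need a household list, which is the practical advantage noted above.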
Cluster Sampling
• In cluster sampling we take a random sample of clusters (groups) and then survey
every member of each selected cluster.
• That is, the sampling units are clusters, and the sampling frame is a list of
these clusters.
• For example, if our clusters were individual schools in a city, we would
randomly select a number of schools and then test all of the students within
those schools.
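The school example can be sketched in code: the frame lists schools only, and every student in a selected school enters the sample. The 12 schools of 30 students each and the function name are assumptions for illustration.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly select whole clusters and include every unit inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)   # frame is the list of clusters
    units = [u for c in chosen for u in clusters[c]]    # take all members of each cluster
    return chosen, units

# Hypothetical city: 12 schools with 30 students each
schools = {f"school_{s:02d}": [f"s{s:02d}_student{i:02d}" for i in range(30)]
           for s in range(12)}
chosen_schools, students = cluster_sample(schools, 3, seed=11)
print(chosen_schools, len(students))
```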
(cont.)
• Advantage
- Less traveling by interviewers, fewer school teachers to negotiate with
- No need for a sampling frame showing all individuals in the population
- Better field supervision
• Disadvantage
- If the clusters contain similar persons (high intra-class correlation), it is difficult to
estimate the precision with which generalizations may be made to the parent
population
- A large number of small clusters is preferable to a small number of large clusters
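The preference for many small clusters can be illustrated with the design effect. The slides do not give a formula; the sketch below assumes the standard approximation for equal-sized clusters, deff = 1 + (m − 1)ρ, where m is the cluster size and ρ the intra-class correlation, and an illustrative ρ of 0.05.

```python
def design_effect(cluster_size, icc):
    """Approximate variance inflation relative to simple random sampling."""
    return 1 + (cluster_size - 1) * icc

def effective_sample_size(n, cluster_size, icc):
    """Number of independent observations the clustered sample is worth."""
    return n / design_effect(cluster_size, icc)

# Hypothetical comparison: 1000 respondents, intra-class correlation 0.05
few_large = effective_sample_size(1000, cluster_size=100, icc=0.05)   # 10 clusters of 100
many_small = effective_sample_size(1000, cluster_size=10, icc=0.05)   # 100 clusters of 10
print(round(few_large), round(many_small))
```

Under these assumptions the same 1000 interviews carry far more information when spread over many small clusters.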
Determination of Sample Size
• Nature of universe / population
- Heterogeneous or homogeneous
- Dispersion factor (or variability)
• Number of variables to be studied
• Number of groups & sub-groups proposed
- Special interest in subgroups
• Nature of study (qualitative or quantitative)
- Intensive & continuous or general survey
• Sampling design or type of sample
(cont.)
• Intended depth of analysis
• Precision and reliability (acceptable confidence level) required
• Level of non-response (item & unit) expected
• Available finance and other resources (trained investigators)
• Size of population
• Nature of units in the population
• Size of questionnaire
NOTE: The absolute size of the sample is more important than the proportion of the population it represents
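Sample Size when Estimating a Percentage
or Proportion
Confidence interval for the population proportion $P$:
$$P = p \pm Z\sqrt{\frac{pq}{n}} \quad \text{for an infinite population}$$
$$P = p \pm Z\sqrt{\frac{pq}{n}}\,\sqrt{\frac{N-n}{N-1}} \quad \text{for a finite population}$$
where $p$ = sample proportion, $q = 1 - p$, $N$ = size of population, $n$ = size of sample,
$e$ = acceptable error (the half-width of the interval; $e = Z\sqrt{pq/n}$ for an infinite population),
$Z$ = standard variate $= \dfrac{p - P}{\sqrt{pq/n}}$ (available from tables for a given confidence level)
Setting the half-width equal to $e$ and solving for $n$:
$$n = \frac{Z^2 pq}{e^2} \quad \text{for an infinite population}$$
$$n = \frac{Z^2 pq N}{(N-1)e^2 + Z^2 pq} \quad \text{for a finite population}$$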
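The two sample-size formulas can be evaluated with a short function. This is a minimal sketch: the inputs (p = 0.5, e = 0.05, Z = 1.96 for 95% confidence, N = 1000) are illustrative values, and rounding up to the next whole unit is an assumption of the example.

```python
import math

def sample_size_proportion(p, e, z=1.96, N=None):
    """Sample size for estimating a proportion p within +/- e.

    N=None uses the infinite-population formula; otherwise the
    finite-population formula with population size N is used.
    """
    q = 1 - p
    if N is None:
        n = z**2 * p * q / e**2
    else:
        n = z**2 * p * q * N / ((N - 1) * e**2 + z**2 * p * q)
    return math.ceil(n)   # round up so the error bound is not exceeded

n_infinite = sample_size_proportion(0.5, 0.05)           # 385
n_finite = sample_size_proportion(0.5, 0.05, N=1000)     # 278
print(n_infinite, n_finite)
```

Using p = 0.5 gives the largest pq and is a common conservative choice when no prior estimate of the proportion is available.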