Chapter 5
Producing Data
5.1 DESIGNING SAMPLES
Some notes before we begin
• We are entering the second part of the
statistics course: “Experimental Design”
• In most real life applications, experimental
design begins the process of statistics
• Provided experiments (and surveys) are
carefully designed, we can use the
techniques of statistics to analyze the
results with increased “significance”
• Much of this material is covered in social
science courses (e.g., psychology)
Population and Sample
Population-
• The entire group of individuals for which
information is produced
Sample-
• A subset of the population that is
examined in greater detail
• Results of the sample are generalized to
the population.
Sample vs. Census
Census
• Information gathered from the entire
population (no exceptions!)
• Produces the most accurate description
of the population
• Usually expensive or impossible
Samples
• By their nature, the success or failure of a
study or experiment depends on good
technique in sampling
• We want our sample to “look like” our
population
– We would like to minimize the effect of outlier
observations
– We would like to decrease ‘variability’ in our
sample
– We would like to decrease ‘bias’
Some ‘bad’ sampling techniques
Voluntary Response Sampling
• Most often seen as a ‘call-in’ poll or an
‘internet poll’
• People with strong, often negative
opinions are most likely to respond
• Polls are easily “fixed”
• This sampling technique and its results
are not to be trusted!
Some ‘bad’ sampling techniques
Convenience Sampling
• Individuals in the sample consist of those who are
easiest to reach
• Mall interviews
– The sample is only valid for people who visit the
mall (this is not everyone!)
– The sample tends to consist of the “easiest targets”
• Some telephone studies
• This is not to say that samples must be difficult
to construct; they just cannot consist of only the
individuals who are easiest to sample
Bias
• In statistics, bias refers to the systematic
favoring of one outcome over another
• Try not to confuse this definition with a
non-statistical definition
• Bias is enemy #1 for sampling technique
Some notation
• The lowercase script ‘n’ always denotes the
number of individuals in a sample
• The capital ‘N’ denotes the size of the
population
• ‘Table B’ (inside back cover) is the table of
random digits
• A random integer can be produced on a TI
calculator with the command “randInt(a, b, n)”
– a = smallest number, b = largest number,
n = number of integers to produce (optional)
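For readers working on a computer rather than a TI calculator, the sketch below is a rough Python analogue of the calculator command. The function name `rand_int` and the `seed` parameter are assumptions made for this illustration; only `random.Random.randint` comes from Python's standard library.

```python
import random

def rand_int(a, b, n=1, seed=None):
    """Return a list of n random integers between a and b, inclusive.

    Hypothetical helper that mimics the calculator command; `seed` is an
    added convenience so that a run can be repeated exactly.
    """
    rng = random.Random(seed)
    return [rng.randint(a, b) for _ in range(n)]

# Five random integers between 1 and 150 (values depend on the seed).
draws = rand_int(1, 150, n=5, seed=1)
print(draws)
```

Note that the integers may repeat, so repeats still have to be skipped when the draws are used as ID numbers for a sample.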
Simple Random Samples
• This is THE sampling technique for this
statistics course
– Other sampling techniques exist, but our
course is focused on the results of an SRS
• Every possible sample of size n has an
equal chance of being selected
• This is analogous to placing “names in a
hat” or “drawing straws”
Choosing an SRS
1. Label Individuals
Assign each individual in the population a
unique “ID”
Each ID should have the same # of digits
2. Select Individuals
Use table B or your calculator to select
individuals
3. Stopping rule
Indicate when you will stop sampling
4. Identify Sample
Indicate which individuals/ID#’s are included in
your sample
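The four steps above can be sketched in Python. This is a minimal illustration, not the course's required method: the function name `choose_srs`, the student names, and the seed are all assumptions, and the random digit groups stand in for reading Table B.

```python
import random

def choose_srs(population, n, seed=None):
    """Choose an SRS of size n by labeling, then reading random digit groups."""
    if n > len(population):
        raise ValueError("sample size cannot exceed population size")
    rng = random.Random(seed)

    # Step 1: label individuals with IDs that all have the same number of digits.
    width = len(str(len(population)))
    labels = {f"{i:0{width}d}": person
              for i, person in enumerate(population, start=1)}

    # Steps 2 and 3: read groups of `width` random digits, skipping unused
    # labels and repeats; stop once n distinct individuals are chosen.
    chosen = []
    while len(chosen) < n:
        digits = "".join(str(rng.randint(0, 9)) for _ in range(width))
        if digits in labels and digits not in chosen:
            chosen.append(digits)

    # Step 4: identify the sample by ID and individual.
    return {label: labels[label] for label in chosen}

# Hypothetical class of 30 students, labeled 01 through 30.
students = [f"student_{i}" for i in range(1, 31)]
sample = choose_srs(students, n=5, seed=42)
print(sample)
```

Skipping repeats and out-of-range digit groups keeps every individual equally likely at each draw, which is what makes the result an SRS.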
Probability Samples
• Samples are chosen by chance
• All possible samples are known
• The probability of choosing each sample
is known
• SRS is one example of a probability
sample
Stratified Random Sample
• Population is divided into strata
– These strata are segments of the population that are
similar in an important way
• Each stratum undergoes an SRS
• The samples from each stratum are combined to
form the full sample
• A stratified sample ensures that all groups are
represented at the appropriate proportion
– Would a sample that consists of 50% boys and 50%
girls make sense for a population of IT consultants?
Stratified Random Sample
Suppose the population contains 100
juniors and 50 seniors
• We would like our samples to reflect this
proportion between juniors and seniors
1. Choose an SRS n=10 from the juniors
2. Choose an SRS n=5 from the seniors
3. The 15 individuals chosen will be the
sample for our Stratified Random Sample
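The juniors and seniors example can be sketched as follows. The function name `stratified_sample` and the ID strings are assumptions for illustration; the stratum sizes (100 and 50) and the sample sizes (10 and 5) come from the example above.

```python
import random

def stratified_sample(strata, total_n, seed=None):
    """Take an SRS from each stratum, sized in proportion to the stratum."""
    rng = random.Random(seed)
    population_size = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Proportional allocation using integer arithmetic.
        k = total_n * len(members) // population_size
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical ID labels for 100 juniors and 50 seniors.
strata = {
    "juniors": [f"J{i:03d}" for i in range(1, 101)],
    "seniors": [f"S{i:03d}" for i in range(1, 51)],
}
sample = stratified_sample(strata, total_n=15, seed=0)
print(len(sample["juniors"]), len(sample["seniors"]))  # 10 5
```

With stratum sizes that do not divide evenly, the integer division rounds down, so the per-stratum counts may sum to slightly less than `total_n`.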
Cluster Sampling
1. The population is divided into clusters or
groups
Each cluster must be representative of
the population (no bias!)
2. One cluster is randomly chosen
Random ID selection (table B, names in a
hat, calculator)
3. The entire cluster that is chosen
becomes the sample
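A minimal sketch of these three steps, assuming the clusters are homerooms in a school. The function name `cluster_sample` and the homeroom data are made up for illustration.

```python
import random

def cluster_sample(clusters, seed=None):
    """Randomly choose one cluster; every member of it becomes the sample."""
    rng = random.Random(seed)
    chosen = rng.choice(sorted(clusters))  # step 2: pick one cluster at random
    return chosen, list(clusters[chosen])  # step 3: the whole cluster is the sample

# Step 1: the population is already divided into (hypothetical) homerooms.
homerooms = {
    "room_101": ["Ana", "Ben", "Cy"],
    "room_102": ["Dee", "Eli", "Fay"],
    "room_103": ["Gus", "Hal", "Ivy"],
}
room, sample = cluster_sample(homerooms, seed=7)
print(room, sample)
```

The randomness here is in which cluster is picked, not in who is picked within it, which is why each cluster needs to resemble the population.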
Multistage sampling
• Used when the population is very large
• Take samples from the samples
repeatedly until the sample size is
“manageable”
• Refer to pg 341
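As a rough two-stage sketch of "samples from the samples": first randomly choose a few regions, then take an SRS within each chosen region. The function name `multistage_sample`, the region names, and the household IDs are assumptions for illustration only; real multistage designs often use more stages.

```python
import random

def multistage_sample(regions, n_regions, n_per_region, seed=None):
    """Stage 1: sample regions. Stage 2: sample individuals within each one."""
    rng = random.Random(seed)
    stage_one = rng.sample(sorted(regions), n_regions)
    return {name: rng.sample(regions[name], n_per_region) for name in stage_one}

# Hypothetical population: 6 regions with 50 households each.
regions = {
    f"region_{r}": [f"r{r}_house_{h}" for h in range(1, 51)]
    for r in range(1, 7)
}
sample = multistage_sample(regions, n_regions=2, n_per_region=5, seed=3)
print(sample)
```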
Cautions about Sample Surveys
Undercoverage
• Sample does not include all segments of the
population, or systematically favors one
segment of the population
• Many telephone samples will contain an
undercoverage bias simply because many
people do not have telephones
– (yes, it’s true)
• This is most serious when the
“undercovered” individuals differ
significantly from the rest of the population.
Cautions about Sample Surveys
Nonresponse
• Many people contacted for a survey choose not to
participate
• Extremely significant if the non-responders differ
from the responders
• Simply “sampling more people” will not eliminate
bias, esp. if the bias is systematically linked to
the nonresponse
– We are likely to get more nonresponse!
• We should either:
(1) redesign the survey, or
(2) follow up on the nonresponders
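To see why sampling more people does not remove nonresponse bias, here is a small calculation under stated assumptions: the population is split between "yes" and "no" opinions, and the two groups respond at different rates. The function name and all the numbers are hypothetical.

```python
def expected_share_among_responders(p_yes, response_rate_yes, response_rate_no):
    """Expected fraction of 'yes' answers among the people who respond."""
    yes_responders = p_yes * response_rate_yes
    no_responders = (1 - p_yes) * response_rate_no
    return yes_responders / (yes_responders + no_responders)

# Hypothetical numbers: opinion is truly split 50/50, but 80% of the "yes"
# group responds while only 40% of the "no" group does.
share = expected_share_among_responders(0.5, 0.8, 0.4)
print(round(share, 3))  # 0.667, well above the true value of 0.5
```

The sample size never appears in the calculation, so contacting more people leaves the expected share unchanged; only a change in the response rates (redesign or follow-up) moves it toward the truth.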
Cautions about Sample Surveys
Response Bias
• Respondents answer in a way that is
different from their actual opinion
• Can be caused by the interviewer
– Responses to appearance- and gender-sensitive
questions can be influenced by the appearance
and gender of the interviewer
Cautions about Sample Surveys
Wording of Questions
• Questions that are “confusing”
– Complicated wording affects responses
• Questions that are “leading”
– Present a scenario that can influence a
response before prompting for a response
– Use words that color the respondent's
opinions
Sample Survey Wisdom
• Insist on knowing the following before
trusting results:
1. The exact questions asked
2. Rate of nonresponse
3. Date and method of survey
• Larger random samples produce more accurate
results than smaller samples
Assignment 5.1A
#2, 6, 7, 9, 11, 24, 26, 32
5.2 DESIGNING EXPERIMENTS
Definitions
An experiment is conducted to reveal the
response of one variable (response
variable) to changes in other variables
(explanatory variable/s)
Definitions
Experimental Units
• The individuals upon whom the
experiment is conducted
• Human experimental units are called
“subjects”
Treatment
• The specific experimental condition
applied to the experimental units
Definitions
Factors
• Another term for explanatory variables in
an experiment
• An experiment can examine the effects
of multiple factors
Levels
• Factors can be applied to experimental
units in different amounts or levels
Principles of Design
• Control
– Minimize the effects of confounding variables
– Obtain and apply treatments to exp. units
• Replication
– Minimize effects of outlier observations
– Use multiple exp units
• Randomization
– Minimize effects of variability from individual
responses
Control
• Try to detect and separate the effects of the
treatment from the effects of other variables
• Control Group
– Represents the population with no treatment
– Often given a placebo treatment
– Provides a “baseline” for comparison
• Don’t confuse “Control” (the principle)
with “Control Group” (the treatment group)
Replication
• We would like exp. units within each
treatment group to respond similarly to the
treatment, and differently from exp. units
in other treatment groups
• BUT variability (and outliers) exists
throughout each treatment group
• If the experiment is replicated many times
(many exp. units), the effects of variability
(and outliers) will “average out”
Replication
• Use enough experimental units to
eliminate “chance variation”
• Replication (in terms of experimental
design) does not mean “repeat the entire
experiment”
• Remember: larger samples produce more
accurate results than smaller samples
Randomization
• Assign experimental units to treatments
using a randomized design (SRS)
• Minimize bias due to individuals’ response
levels to different treatments
Statistical Significance
• After experimentation, we hope to see a
difference in response level that is
large/measurable
• A difference that is too large to have
happened “by chance” is called statistically
significant
• We try to produce statistically significant
results!
• We will discuss how large the difference
must be in future chapters.
Randomized Comparative
Experiments
• Completely Randomized Design
– Most basic
• Block Design
– Used when we believe there is a difference
in response levels of different groups
• Matched Pairs Design
– Compares only two treatments
– Measures effect of treatment on two very
similar exp units
Completely Randomized Design
• Can be used for many treatments
• Exp units assigned to treatment group
randomly
• Response in each treatment group is
averaged
• Average of each treatment group is
compared
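The design above can be sketched as: shuffle the experimental units, deal them out to the treatment groups, then average each group's response. The function names, the unit labels, and especially the response values are placeholders invented for this illustration, not real data.

```python
import random
from statistics import mean

def completely_randomized(units, treatments, seed=None):
    """Randomly assign units to treatment groups of (nearly) equal size."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    # Deal the shuffled units out to the treatments like cards.
    return {t: shuffled[i::len(treatments)] for i, t in enumerate(treatments)}

def group_means(groups, responses):
    """Average the response within each treatment group."""
    return {t: mean(responses[u] for u in members)
            for t, members in groups.items()}

units = [f"unit_{i}" for i in range(1, 21)]
groups = completely_randomized(units, ["treatment", "control"], seed=3)

# Placeholder responses, made up purely so that the averaging step runs.
responses = {u: float(i) for i, u in enumerate(units)}
print(group_means(groups, responses))
```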
Completely Randomized Design
(Example Diagram)
Block Design
• This is an instance of control
• Exp units are known to fall into groups with
similar response levels (e.g., gender
differences)
• Exp units are “blocked” according to
these groups
• Each block undergoes an SRS into
treatment groups
Block Design
• Each treatment group is averaged and
compared within the block
• Each block may (or may not) have a
control group
• Form blocks based on the most
important unavoidable sources of
variability among exp units
• “Control what you can, block what you
can’t control, randomize the rest”
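A minimal sketch of a block design: the units arrive already grouped into blocks, and the random assignment to treatments is carried out separately inside each block. The function name `randomized_block`, the block names, and the unit labels are assumptions for illustration.

```python
import random

def randomized_block(units_by_block, treatments, seed=None):
    """Within each block, randomly split the units among the treatments."""
    rng = random.Random(seed)
    design = {}
    for block, units in units_by_block.items():
        shuffled = list(units)
        rng.shuffle(shuffled)  # randomize only within this block
        design[block] = {t: shuffled[i::len(treatments)]
                         for i, t in enumerate(treatments)}
    return design

# Hypothetical blocks of 6 and 8 experimental units.
units_by_block = {
    "block_A": [f"A{i}" for i in range(1, 7)],
    "block_B": [f"B{i}" for i in range(1, 9)],
}
design = randomized_block(units_by_block, ["treatment", "control"], seed=5)
print(design)
```

Responses would then be averaged and compared within each block, so that differences between blocks do not get mixed up with the treatment effect.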
Block Design
(Example Diagram)
Matched Pairs Design
• Exp units are matched into pairs that are
similar in terms of the experiment
• Each of two experimental units will
receive a different treatment
• Many times, the subjects in the pair are
the same person
• The effect of the response from the
matched pair is measured with a simple
subtraction
Matched Pairs Design
• Randomization-
– Randomize which member of the pair
receives which treatment
– Randomize the order the treatments are
applied
– Often randomization can be done with a coin
flip!
– Sometimes, it is important to have a length
of time between treatment applications
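The coin-flip randomization and the "simple subtraction" can be sketched as below. The function names, the pair labels, and the response values are all made up for illustration; they are not data from any experiment.

```python
import random

def assign_pairs(pairs, seed=None):
    """Flip a coin for each pair to decide who receives the treatment."""
    rng = random.Random(seed)
    assignment = []
    for first, second in pairs:
        if rng.random() < 0.5:  # "heads": the first member is treated
            assignment.append({"treatment": first, "control": second})
        else:                   # "tails": the second member is treated
            assignment.append({"treatment": second, "control": first})
    return assignment

def paired_differences(assignment, responses):
    """Treatment response minus control response, one value per pair."""
    return [responses[p["treatment"]] - responses[p["control"]]
            for p in assignment]

pairs = [("A1", "A2"), ("B1", "B2"), ("C1", "C2")]
assignment = assign_pairs(pairs, seed=11)

# Placeholder responses, invented only so that the subtraction step runs.
responses = {"A1": 12.0, "A2": 10.0, "B1": 9.0, "B2": 9.5, "C1": 14.0, "C2": 11.0}
print(paired_differences(assignment, responses))
```

When both members of a pair are the same person, the same coin flip can instead decide which treatment that person receives first.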
Matched Pairs Design
(example diagram – single subject)
[Diagram: each subject (#1 through #n) receives both the treatment and the control, in a randomized order; the two responses are compared for each subject.]
Matched Pairs Design
(example diagram – paired subjects)
[Diagram: subjects are matched into pairs (#1 with #2, #3 with #4, ..., #n-1 with #n); within each pair, the treatment and control are randomly assigned, and the two responses are compared.]
Cautions about Experimentation
Double Blind Experiment
• Sometimes bias is produced unconsciously
• Sometimes a subject will produce bias if he
knows he is receiving a placebo treatment
• Effects can be controlled if neither the
experimenter nor the subject knows which
treatment was administered
• Typically, the treatment is given an ID number
and only the researcher will know which
treatment corresponds to which ID.
• Controls the placebo effect
Cautions about Experimentation
Lack of realism
• Experimental results are produced under
conditions that cannot be realistically
duplicated
• Subjects who know they are exp units
may behave differently than the
population
• The laboratory setting itself may be a
variable of the experiment!
