1
Research Methods in Health
Chapter 8. Statistical Methods 2: ANOVA
Young Moon Chae, Ph.D.
Graduate School of Public Health
Yonsei University, Korea
ymchae@yuhs.ac
2
Table of Contents
• One-way ANOVA
• Multiple comparisons
• Repeated measures ANOVA
• ANCOVA
3
One-Way ANOVA
4
ANOVA
When to use it
• Analysis of variance (ANOVA) is the most commonly used technique for comparing
the means of groups of measurement data. There are lots of different experimental
designs that can be analyzed with different kinds of ANOVA
• In a one-way ANOVA (also known as a single-classification ANOVA), there is one
measurement variable and one nominal variable.
Null hypothesis
• The statistical null hypothesis is that the means of the measurement variable are the
same for the different categories of data; the alternative hypothesis is that they are
not all the same.
5
Rationale for ANOVA (1)
• We have at least 3 means to test, e.g., H0: μ1 = μ2 = μ3.
• Could take them 2 at a time, but really want to test all 3 (or more) at once.
• Instead of using a mean difference, we can use the variance of the group
means about the grand mean over all groups.
• Logic is just the same as for the t-test. Compare the observed variance
among means (observed difference in means in the t-test) to what we would
expect to get by chance.
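The rationale above can be sketched numerically. The following is a minimal Python sketch, not part of the original slides: it compares the variance of group means about the grand mean (between-group mean square) with the variance expected by chance (within-group mean square). The function name `one_way_anova` and the three small groups are made-up illustrations.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of samples (illustrative helper)."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k = len(groups)
    n_total = len(all_values)
    # Between-group sum of squares: group means about the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: observations about their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical data for three groups (illustration only).
groups = [[4, 5, 6], [6, 7, 8], [9, 10, 11]]
f_stat, df_b, df_w = one_way_anova(groups)
print(f_stat, df_b, df_w)
```

The resulting F would then be compared with a tabled critical value of F with (df_between, df_within) degrees of freedom.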
6
ANOVA Assumptions
• Data in each group are a random sample from some population.
• Observations within groups are independent.
• Samples are independent.
• Underlying populations normally distributed.
• Underlying populations have the same variance (this can be formally tested
with Bartlett’s test).
• What might happen (why would it be a problem) if the assumption of
{normality, equality of error, independence of error} turned out to be false?
-> Use non-parametric statistics or use data transformation
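As a rough illustration of the equal-variance check mentioned above, the sketch below computes Bartlett's test statistic by hand using the standard textbook formula. The function name `bartlett_statistic` and both data sets are assumptions made for illustration; the statistic is referred to a chi-square distribution with k - 1 degrees of freedom (5.991 is the usual tabled 5% critical value for 2 df).

```python
import math
from statistics import variance

def bartlett_statistic(groups):
    """Bartlett's statistic for equality of variances (illustrative helper)."""
    k = len(groups)
    sizes = [len(g) for g in groups]
    n_total = sum(sizes)
    variances = [variance(g) for g in groups]  # sample variances (n - 1 denominator)
    # Pooled variance, weighted by each group's degrees of freedom.
    pooled = sum((n - 1) * v for n, v in zip(sizes, variances)) / (n_total - k)
    numerator = (n_total - k) * math.log(pooled) - sum(
        (n - 1) * math.log(v) for n, v in zip(sizes, variances)
    )
    correction = 1 + (sum(1 / (n - 1) for n in sizes) - 1 / (n_total - k)) / (3 * (k - 1))
    return numerator / correction

# Hypothetical data: equal spread in every group, then very unequal spread.
equal_spread = [[4, 5, 6], [6, 7, 8], [9, 10, 11]]
unequal_spread = [[1, 2, 3], [2, 4, 6], [0, 10, 20]]
t_equal = bartlett_statistic(equal_spread)
t_unequal = bartlett_statistic(unequal_spread)
print(t_equal, t_unequal)
```

With three groups, a statistic above 5.991 would suggest unequal variances at the 5% level. Bartlett's test is itself sensitive to non-normality, so this is a sketch of the calculation rather than a recommendation.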
7
Multiple comparisons
• When we carry out an ANOVA on k treatments, we test
H0 : μ1 = · · · = μk versus Ha : H0 is false
• Assume we reject the null hypothesis, i.e. we have some evidence that not
all treatment means are equal. Then we could for example be interested in
which ones are the same, and which ones differ.
• For this, we might have to carry out some more hypothesis tests.
• This procedure is referred to as multiple comparisons.
8
Types of multiple comparisons
• There are two different types of multiple comparisons procedures:
• Sometimes we already know in advance what questions we want to answer.
Those comparisons are called planned (or a priori) comparisons.
• Sometimes we do not know in advance what questions we want to answer,
and the decision about which group means to compare
depends on the ANOVA outcome. Those comparisons are called unplanned
(or a posteriori) comparisons.
-Planned comparisons: adjust for just those tests that are planned.
-Unplanned comparisons: adjust for all possible comparisons.
9
Independence of planned comparison
$\sum_j c_{1j} c_{2j} = 0$

Comparison    A1     A2     A3     A4
1            -1/3    1     -1/3   -1/3
2            -1/2    0     -1/2    1
3             1/2    1/2   -1/2   -1/2

$\sum_j c_{1j} c_{2j} = (-1/3)(-1/2) + (1)(0) + (-1/3)(-1/2) + (-1/3)(1) = 0$
$\sum_j c_{1j} c_{3j} = (-1/3)(1/2) + (1)(1/2) + (-1/3)(-1/2) + (-1/3)(-1/2) = 4/6 = 2/3$
Comparisons 1 and 2 are orthogonal; comparisons 1 and 3 are not.
There are J-1 orthogonal comparisons. Use only what you need.
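The orthogonality check can be reproduced with exact fractions. This is a small sketch, assuming equal cell sizes (so the sum of coefficient cross-products is the whole criterion); the helper name `cross_product_sum` is made up for illustration.

```python
from fractions import Fraction as Fr

# Contrast coefficients for comparisons 1-3 over cells A1..A4 (from the slide).
comparisons = {
    1: [Fr(-1, 3), Fr(1), Fr(-1, 3), Fr(-1, 3)],
    2: [Fr(-1, 2), Fr(0), Fr(-1, 2), Fr(1)],
    3: [Fr(1, 2), Fr(1, 2), Fr(-1, 2), Fr(-1, 2)],
}

def cross_product_sum(c_a, c_b):
    """Sum over cells of the products of two sets of contrast coefficients."""
    return sum(a * b for a, b in zip(c_a, c_b))

sum_12 = cross_product_sum(comparisons[1], comparisons[2])
sum_13 = cross_product_sum(comparisons[1], comparisons[3])
print(sum_12, sum_13)  # a sum of zero means the pair is orthogonal
```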
10
Example 1
• We previously investigated whether the mean blood coagulation times for
animals receiving different diets (A, B, C or D) were the same.
• Imagine A is the standard diet, and we wish to compare each of diets B, C,
D to diet A.
→ planned comparisons!
• After inspecting the treatment means, we find that A and D look similar, and
B and C look similar, but A and D are quite different from B and C. We
might want to formally test the hypothesis
→ unplanned comparisons!
11
Example 2
• A plant physiologist recorded the length of pea sections grown in tissue
culture with auxin present. The purpose of the experiment was to
investigate the effects of various sugars on growth. Four different treatments
were used, plus one control (no sugar):
No sugar
2% glucose
2% fructose
1% glucose + 1% fructose
2% sucrose
• The investigator wants to answer three specific questions:
- Does the addition of sugars have an effect on the lengths of the pea sections?
- Are there differences between the pure sugar treatments and the mixed sugar
treatment?
- Are there differences among the pure sugar treatments?
→ planned comparisons!
12
ANOVA Table
13
Bonferroni Correction
• Suppose we have 10 treatment groups, and so 45 pairs.
• If we perform 45 t-tests at the significance level α = 0.05, we’d expect to
reject 5% × 45 ≈ 2 of them, even if all of the means were the same.
• Let α = Pr(reject at least one pairwise test | all μ’s the same)
≤ (no. tests) × Pr(reject test #1 | μ’s the same)
• The Bonferroni correction:
Use α′ = α/(no. tests) as the significance level for each test.
• For example, with 10 groups and so 45 pairwise tests,
we’d use α′ = 0.05 / 45 ≈ 0.0011 for each test.
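The arithmetic on this slide can be checked with a few lines of Python. The family-wise error rates at the end assume the 45 tests are independent, which pairwise comparisons are not; that part is only a rough illustration of why the correction matters. All variable names are illustrative.

```python
from math import comb

n_groups = 10
alpha = 0.05

n_tests = comb(n_groups, 2)            # number of distinct pairs of groups
expected_false = alpha * n_tests       # expected rejections if all means are equal
alpha_prime = alpha / n_tests          # Bonferroni per-test significance level

# Assumption for illustration only: the tests are treated as independent.
fwer_uncorrected = 1 - (1 - alpha) ** n_tests
fwer_bonferroni = 1 - (1 - alpha_prime) ** n_tests

print(n_tests, expected_false, round(alpha_prime, 4))
print(round(fwer_uncorrected, 3), round(fwer_bonferroni, 3))
```

Under that independence assumption, the chance of at least one false rejection without correction is far above 5%, while with α′ it stays just under 5%.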
14
Post Hoc Tests
• Given a significant F, where are the mean differences?
• Often do not have planned comparisons.
• Usually compare pairs of means.
• There are many methods of post hoc (after the fact) tests.
15
Scheffé
• Can use for any contrast. Follows same calculations, but uses different
critical values.
• Instead of comparing the test statistic to a critical value of t, use:
$t = \hat{\psi} / \sqrt{\text{est. var}(\hat{\psi})}$
$S = \sqrt{(J-1) F_\alpha}$
where the F comes from the overall F test (J-1 and N-J df).
16
Scheffé (2)
Source           SS    df   MS   F
Cells (A1-A4)    219    3   73   12.17
Error             72   12    6
Total            291   15

$\hat{\psi}_1 = (.5)(18) + (.5)(25) - (.5)(28) - (.5)(22) = -3.5$
$\text{est. var}(\hat{\psi}) = 6 \cdot \frac{(.5)^2 + (.5)^2 + (-.5)^2 + (-.5)^2}{4} = 3/2$
$t = \hat{\psi} / \sqrt{\text{est. var}(\hat{\psi})} = -3.5 / 1.2247 = -2.86$
$F_\alpha(.05; 3, 12) = 3.49$
$S = \sqrt{(J-1) F_\alpha} = \sqrt{(4-1)(3.49)} = 3.24$
(Data from earlier problem.)
The comparison is not significant because |-2.86| < 3.24.
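The worked Scheffé example can be reproduced as follows. The cell means, contrast coefficients, MS error, and the critical F of 3.49 are the slide's own numbers; n = 4 per cell is read off the slide's variance formula, and the variable names are illustrative.

```python
import math

means = [18, 25, 28, 22]          # cell means A1..A4 (from the slide)
coeffs = [0.5, 0.5, -0.5, -0.5]   # contrast coefficients
ms_error = 6                      # MS error from the ANOVA table
n_per_cell = 4                    # cell size implied by the slide's variance formula
n_groups = 4                      # J
f_crit = 3.49                     # tabled F(.05; 3, 12), taken from the slide

psi_hat = sum(c * m for c, m in zip(coeffs, means))
var_psi = ms_error * sum(c ** 2 for c in coeffs) / n_per_cell
t_stat = psi_hat / math.sqrt(var_psi)
s_crit = math.sqrt((n_groups - 1) * f_crit)
significant = abs(t_stat) > s_crit

print(psi_hat, var_psi, round(t_stat, 2), round(s_crit, 2), significant)
```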
17
Paired comparisons
• Newman-Keuls and Tukey HSD depend on q, the studentized range
statistic.
• Suppose we have J independent sample means and we find the
largest and the smallest.
$q = (\bar{y}_{max} - \bar{y}_{min}) / \sqrt{MS_{error} / n}$
MS error comes from the ANOVA we did to get
the J means. The n refers to sample size per
cell. If the two cell sizes are unequal, use
2n1n2/(n1+n2) in place of n.
The sampling distribution of q depends on k, the number of means
covered by the range (max-min), and on v, the degrees of freedom
for MS error.
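A minimal sketch of the q calculation, including the 2n1n2/(n1+n2) adjustment for unequal cell sizes. The function names and all numbers below are hypothetical, chosen only so the arithmetic is easy to follow.

```python
import math

def harmonic_n(n1, n2):
    """Effective cell size when the two cells being compared differ in size."""
    return 2 * n1 * n2 / (n1 + n2)

def studentized_range(mean_max, mean_min, ms_error, n):
    """q = (largest mean - smallest mean) / sqrt(MS_error / n)."""
    return (mean_max - mean_min) / math.sqrt(ms_error / n)

# Hypothetical values (illustration only).
n_eff = harmonic_n(10, 15)
q = studentized_range(mean_max=30, mean_min=24, ms_error=12, n=n_eff)
print(n_eff, q)
```

The computed q would then be compared with a tabled critical value that depends on k and on the error degrees of freedom.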
18
Tukey HSD
HSD = honestly significant difference.
For HSD, use k = J, the number of groups in the study. Choose
alpha, and find the df for error. Look up the value qα. Then find
the value:
$HSD = q_\alpha \sqrt{MS_{error} / n}$
Compare HSD to the absolute value of the difference between all
pairs of means. Any difference larger than HSD is significant.
19
HSD 2
Grp ->   1    2    3    4    5
M   ->  63   82   80   77   70

Source   SS       df   MS      F      p
Grps     2942.4    4   735.6   4.13   <.05
Error    9801.0   55   178.2

K = 5 groups; n = 12 per group; v = 55 df. Tabled value of q with alpha = .05 is 3.98.
$HSD = q_\alpha \sqrt{MS_{error} / n} = 3.98 \sqrt{178.2 / 12} = 15.34$

Absolute differences between means (groups ordered by mean; * = larger than HSD):
Group (M)    1    5    4    3    2
1 (63)       0    7   14   17*  19*
5 (70)            0    7   10   12
4 (77)                 0    3    5
3 (80)                      0    2
2 (82)                           0
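The HSD example can be checked with a short script. The group means, MS error, n, and the tabled q of 3.98 are the slide's values; the variable names are illustrative.

```python
import math
from itertools import combinations

means = {1: 63, 2: 82, 3: 80, 4: 77, 5: 70}  # group means from the slide
ms_error = 178.2
n_per_group = 12
q_crit = 3.98  # tabled q for k = 5 and 55 error df at alpha = .05 (from the slide)

hsd = q_crit * math.sqrt(ms_error / n_per_group)

# Every pair of groups whose absolute mean difference exceeds HSD.
significant_pairs = [
    (a, b)
    for a, b in combinations(sorted(means), 2)
    if abs(means[a] - means[b]) > hsd
]

print(round(hsd, 2), significant_pairs)
```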
20
Comparing Post Hoc Tests
• The Newman-Keuls test found 3 significant differences in our example. The
HSD found 2 differences.
• If we had used the Bonferroni approach, we would have found that a
difference of 15.91 is required for significance (and therefore the same two
significant differences as HSD). Thus, power decreases from the Newman-Keuls to
the HSD to the Bonferroni.
• The type I error rates run in the opposite order: lowest for Bonferroni, then
HSD, and highest for Newman-Keuls. Do you want to be liberal or
conservative in your choice of tests? Type I error vs. power.
21
Repeated Measures ANOVA
When to Use Repeated Measures ANOVA
• Repeated measures ANOVA is used when all members of a random sample are
measured under a number of different conditions. As the sample is exposed to each
condition in turn, the measurement of the dependent variable is repeated. Using a
standard ANOVA in this case is not appropriate because the data violate the ANOVA
assumption of independence.
• This approach is used for several reasons.
- Some research hypotheses require repeated measures. Longitudinal research, for
example, measures each sample member at each of several ages. In this case, age
would be a repeated factor.
- When sample members are difficult to recruit, repeated measures designs are
economical because each member is measured under all conditions.
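To illustrate why repeated measurements need their own analysis, the sketch below runs a one-way repeated measures ANOVA with a single within-subjects factor. It removes the variation between subjects from the error term, which a standard ANOVA would not do. The function name `repeated_measures_anova` and the data (4 subjects measured under 3 conditions) are made up for illustration, and the sketch ignores between-subjects factors and the sphericity assumption.

```python
from statistics import mean

def repeated_measures_anova(data):
    """data[i][j] = score of subject i under condition j (illustrative helper)."""
    n = len(data)       # number of subjects
    k = len(data[0])    # number of conditions (trials)
    grand = mean([x for row in data for x in row])
    cond_means = [mean([row[j] for row in data]) for j in range(k)]
    subj_means = [mean(row) for row in data]

    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    # Error is what remains after removing condition and subject effects.
    ss_error = ss_total - ss_cond - ss_subj

    df_cond = k - 1
    df_error = (k - 1) * (n - 1)
    f_stat = (ss_cond / df_cond) / (ss_error / df_error)
    return f_stat, df_cond, df_error

# Hypothetical scores: 4 subjects, each measured under 3 conditions.
scores = [[1, 2, 4], [2, 4, 5], [3, 4, 6], [2, 2, 5]]
f_stat, df_cond, df_error = repeated_measures_anova(scores)
print(f_stat, df_cond, df_error)
```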
22
Statistical Terminology Used in this Document
• A sample member is called a subject.
• When a dependent variable is measured repeatedly for all sample members
across a set of conditions, this set of conditions is called a within-subjects
factor. The conditions that constitute this type of factor are called trials.
• When a dependent variable is measured on independent groups of sample
members, where each group is exposed to a different condition, the set of
conditions is called a between-subjects factor. The conditions that
constitute this factor type are called groups.
• When an analysis has both within-subjects factors and between-subjects
factors, it is called a repeated measures ANOVA with between-subjects
factors.
23
Example
• Suppose that, as a health researcher, you want to examine the impact of dietary
habit and exercise on pulse rate. To investigate these issues, you collect a sample
of individuals and group them according to their dietary preferences: meat eaters
and vegetarians. You then divide each diet category into three groups, randomly
assigning each group to one of three types of exercise: aerobic stair climbing,
racquetball, and weight training. So far, then, your design has two between-
subjects (grouping) factors: dietary preference and exercise type.
• Suppose that, in addition to these between-subjects factors, you want to include a
single within-subjects factor in the analysis. Each subject's pulse rate will be
measured at three levels of exertion: after warm-up exercises, after jogging, and
after running. Thus, intensity (of exertion) is the within-subjects factor in this
design. The order of these three measurements will be randomly assigned for each
subject.
24
Research Questions :
Within-Subjects Main Effect
• Does intensity influence pulse rate? (Does mean pulse rate change across the trials
for intensity?) This is the test for a within-subjects main effect of intensity.
Between-Subjects Main Effects
• Does dietary preference influence pulse rate? (Do vegetarians have different mean
pulse rates than meat eaters?) This is the test for a between-subjects main effect of
dietary preference.
• Does exercise type influence pulse rate? (Are there differences in mean pulse rates
between stair climbers, racquetball players, and weight trainers?) This is the test for a
between-subjects main effect of exercise type.
Between-Subjects Interaction Effect
• Does the influence of exercise type on pulse rate depend on dietary preference?
(Does the pattern of differences between mean pulse rates for exercise-type groups
change for each dietary-preference group?) This is the test for a between-subjects
interaction of exercise type by dietary preference.
25
Results
• Diet: With a p value less than .0001, you have a statistically significant effect. You
can therefore conclude that a statistically significant difference exists between
vegetarians and meat eaters on their overall pulse rates. In other words, there is
a main effect for diet. The cell means (not shown here) show that meat eaters
experience higher pulse rates than vegetarians.
• Exercise type: It is non-significant: F(2, 144) = .31, p=.7341. Thus, you can conclude
that the type of exercise has no statistically significant effect on overall mean
pulse rates.
• The test of the DIET BY EXERTYPE interaction also shows a non-significant result
(F(2, 144) = .52, p=.594). This suggests that dietary preferences and type of
exercise do not combine to influence the overall average pulse rate.
• When an interaction effect is significant, the pattern of cell means must be examined
to determine the meaning not only of the interaction, but also the meaning of any
main effects involved in the interaction.
26
Analyzing Continuous and Categorical IVs
Simultaneously
Analysis of Covariance
27
(cont.)
When to use it
• The purpose of ANCOVA is to compare two or more linear regression lines.
It is a way of comparing the Y variable among groups while statistically
controlling for variation in Y caused by variation in the X variable.
Null hypotheses
• Two null hypotheses are tested in an ANCOVA. The first is that the slopes
of the regression lines are all the same. If this hypothesis is not rejected, the
second null hypothesis is tested: that the Y-intercepts of the regression lines
are all the same.
• Although the most common use of ANCOVA is for comparing two
regression lines, it is possible to compare three or more regressions. If their
slopes are all the same, it is then possible to do planned or unplanned
comparisons of Y-intercepts, similar to the planned or unplanned
comparisons of means in an ANOVA
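The idea of comparing regression lines can be sketched descriptively. The code below fits a least-squares line per group, computes the pooled within-group slope, and uses it to adjust each group's mean of Y to the overall mean of X. It does not carry out the significance tests for slopes or intercepts; the function names and the two small data sets are hypothetical.

```python
from statistics import mean

def slope_intercept(xs, ys):
    """Ordinary least-squares slope and intercept for one group."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def pooled_slope(groups):
    """Common slope estimated from within-group sums of squares and products."""
    sxy = sxx = 0.0
    for xs, ys in groups.values():
        x_bar, y_bar = mean(xs), mean(ys)
        sxy += sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
        sxx += sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

def adjusted_means(groups):
    """Group means of Y adjusted to the grand mean of the covariate X."""
    b = pooled_slope(groups)
    grand_x = mean([x for xs, _ in groups.values() for x in xs])
    return {name: mean(ys) - b * (mean(xs) - grand_x)
            for name, (xs, ys) in groups.items()}

# Hypothetical data: both groups follow lines with slope 2 but different intercepts.
groups = {
    "A": ([1, 2, 3, 4], [3, 5, 7, 9]),       # y = 2x + 1
    "B": ([3, 4, 5, 6], [10, 12, 14, 16]),   # y = 2x + 4
}
lines = {name: slope_intercept(xs, ys) for name, (xs, ys) in groups.items()}
adj = adjusted_means(groups)
print(lines, adj)
```

In this made-up example the raw group means of Y differ by 7, but after adjusting for X the difference shrinks to 3, which equals the difference between the two intercepts.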
28
ANCOVA (GLM): Example
The General Linear Model (GLM) approach to ANCOVA is used to
determine whether MCAT scores are significantly different among
medical students who had different types of undergraduate majors,
when adjusted for year of matriculation.
29
ANCOVA (GLM): cont.
• Dependent variable
- nmtot1: MCAT total (most recent)
• Fixed factor
- bmaj2: Undergraduate major
1 = Biology/Chemistry
2 = Other science/health
3 = Other
• Covariate
- matyr: Year of matriculation
30
ANCOVA (GLM) Output:
Descriptive Statistics
31
ANCOVA (GLM) Output:
Descriptive Statistics (continued)
32
ANCOVA (GLM) Output:
Tests of Between-Subjects Effects
33
ANCOVA (GLM):
Estimated Marginal Means: Undergraduate Major
34
ANCOVA (GLM):
Post Hoc Tests: Undergraduate Major
35
ANCOVA (GLM):
Profile Plot: Gender*Resloc

Más contenido relacionado

La actualidad más candente

La actualidad más candente (20)

Two sample t-test
Two sample t-testTwo sample t-test
Two sample t-test
 
Introduction to t-tests (statistics)
Introduction to t-tests (statistics)Introduction to t-tests (statistics)
Introduction to t-tests (statistics)
 
Anova lecture
Anova lectureAnova lecture
Anova lecture
 
t-test vs ANOVA
t-test vs ANOVAt-test vs ANOVA
t-test vs ANOVA
 
Chi – square test
Chi – square testChi – square test
Chi – square test
 
The Chi Square Test
The Chi Square TestThe Chi Square Test
The Chi Square Test
 
ANOVA.pdf
ANOVA.pdfANOVA.pdf
ANOVA.pdf
 
ANOVA - BI FACTORIAL ANOVA (2- WAY ANOVA)
ANOVA - BI FACTORIAL ANOVA (2- WAY ANOVA)ANOVA - BI FACTORIAL ANOVA (2- WAY ANOVA)
ANOVA - BI FACTORIAL ANOVA (2- WAY ANOVA)
 
Hypothesis Testing
Hypothesis TestingHypothesis Testing
Hypothesis Testing
 
Analysis of variance anova
Analysis of variance anovaAnalysis of variance anova
Analysis of variance anova
 
7 anova chi square test
 7 anova chi square test 7 anova chi square test
7 anova chi square test
 
Analysis of variance (ANOVA) everything you need to know
Analysis of variance (ANOVA) everything you need to knowAnalysis of variance (ANOVA) everything you need to know
Analysis of variance (ANOVA) everything you need to know
 
Factorial ANOVA
Factorial ANOVAFactorial ANOVA
Factorial ANOVA
 
Ducan’s multiple range test - - Dr. Manu Melwin Joy - School of Management St...
Ducan’s multiple range test - - Dr. Manu Melwin Joy - School of Management St...Ducan’s multiple range test - - Dr. Manu Melwin Joy - School of Management St...
Ducan’s multiple range test - - Dr. Manu Melwin Joy - School of Management St...
 
Manova
ManovaManova
Manova
 
Anova (f test) and mean differentiation
Anova (f test) and mean differentiationAnova (f test) and mean differentiation
Anova (f test) and mean differentiation
 
F test and ANOVA
F test and ANOVAF test and ANOVA
F test and ANOVA
 
Manova
ManovaManova
Manova
 
Multiple Linear Regression II and ANOVA I
Multiple Linear Regression II and ANOVA IMultiple Linear Regression II and ANOVA I
Multiple Linear Regression II and ANOVA I
 
Analysis of variance ppt @ bec doms
Analysis of variance ppt @ bec domsAnalysis of variance ppt @ bec doms
Analysis of variance ppt @ bec doms
 

Similar a Research method ch08 statistical methods 2 anova

ANOVA_PDF.pdf biostatistics course material
ANOVA_PDF.pdf biostatistics course materialANOVA_PDF.pdf biostatistics course material
ANOVA_PDF.pdf biostatistics course material
AmanuelIbrahim
 
Applied statistics lecture_8
Applied statistics lecture_8Applied statistics lecture_8
Applied statistics lecture_8
Daria Bogdanova
 

Similar a Research method ch08 statistical methods 2 anova (20)

1 ANOVA.ppt
1 ANOVA.ppt1 ANOVA.ppt
1 ANOVA.ppt
 
univariate and bivariate analysis in spss
univariate and bivariate analysis in spss univariate and bivariate analysis in spss
univariate and bivariate analysis in spss
 
Parametric tests
Parametric testsParametric tests
Parametric tests
 
tests of significance
tests of significancetests of significance
tests of significance
 
Statistics using SPSS
Statistics using SPSSStatistics using SPSS
Statistics using SPSS
 
Shovan anova main
Shovan anova mainShovan anova main
Shovan anova main
 
mean comparison.pptx
mean comparison.pptxmean comparison.pptx
mean comparison.pptx
 
mean comparison.pptx
mean comparison.pptxmean comparison.pptx
mean comparison.pptx
 
Statistics for Anaesthesiologists
Statistics for AnaesthesiologistsStatistics for Anaesthesiologists
Statistics for Anaesthesiologists
 
HFS3283 paired t tes-t and anova
HFS3283 paired t tes-t and anovaHFS3283 paired t tes-t and anova
HFS3283 paired t tes-t and anova
 
Bus 173_4.pptx
Bus 173_4.pptxBus 173_4.pptx
Bus 173_4.pptx
 
Parametric test - t Test, ANOVA, ANCOVA, MANOVA
Parametric test  - t Test, ANOVA, ANCOVA, MANOVAParametric test  - t Test, ANOVA, ANCOVA, MANOVA
Parametric test - t Test, ANOVA, ANCOVA, MANOVA
 
Epidemiological study design and it's significance
Epidemiological study design and it's significanceEpidemiological study design and it's significance
Epidemiological study design and it's significance
 
ANOVA 2023 aa 2564896.pptx
ANOVA 2023  aa 2564896.pptxANOVA 2023  aa 2564896.pptx
ANOVA 2023 aa 2564896.pptx
 
Chemometrics-ANALYTICAL DATA SIGNIFICANCE TESTS.pptx
Chemometrics-ANALYTICAL DATA SIGNIFICANCE TESTS.pptxChemometrics-ANALYTICAL DATA SIGNIFICANCE TESTS.pptx
Chemometrics-ANALYTICAL DATA SIGNIFICANCE TESTS.pptx
 
T test^jsample size^j ethics
T test^jsample size^j ethicsT test^jsample size^j ethics
T test^jsample size^j ethics
 
Testing of hypothesis.pptx
Testing of hypothesis.pptxTesting of hypothesis.pptx
Testing of hypothesis.pptx
 
Analysis of Variance
Analysis of VarianceAnalysis of Variance
Analysis of Variance
 
ANOVA_PDF.pdf biostatistics course material
ANOVA_PDF.pdf biostatistics course materialANOVA_PDF.pdf biostatistics course material
ANOVA_PDF.pdf biostatistics course material
 
Applied statistics lecture_8
Applied statistics lecture_8Applied statistics lecture_8
Applied statistics lecture_8
 

Más de naranbatn

эрүүл мэндийн шинжлэх ухаан
эрүүл мэндийн шинжлэх ухаанэрүүл мэндийн шинжлэх ухаан
эрүүл мэндийн шинжлэх ухаан
naranbatn
 
Instructions to authors
Instructions to authorsInstructions to authors
Instructions to authors
naranbatn
 
Төгсөлтийн сургалтыг зохицуулах журам
Төгсөлтийн сургалтыг зохицуулах журамТөгсөлтийн сургалтыг зохицуулах журам
Төгсөлтийн сургалтыг зохицуулах журам
naranbatn
 
хичээлийн хуваарь 2011-2012 1-р улирал
хичээлийн хуваарь 2011-2012 1-р улиралхичээлийн хуваарь 2011-2012 1-р улирал
хичээлийн хуваарь 2011-2012 1-р улирал
naranbatn
 
д.амарсайхан захирал
д.амарсайхан захиралд.амарсайхан захирал
д.амарсайхан захирал
naranbatn
 
ц.лхагвасүрэн захирал
ц.лхагвасүрэн захиралц.лхагвасүрэн захирал
ц.лхагвасүрэн захирал
naranbatn
 
Self eval report english final for printing, 28.09.2011last for printing
Self eval report english final for printing, 28.09.2011last for printingSelf eval report english final for printing, 28.09.2011last for printing
Self eval report english final for printing, 28.09.2011last for printing
naranbatn
 
Бүрдүүлэх материал
Бүрдүүлэх материалБүрдүүлэх материал
Бүрдүүлэх материал
naranbatn
 
Health sciences university of mongolia
Health sciences university of mongoliaHealth sciences university of mongolia
Health sciences university of mongolia
naranbatn
 
Магистрын ганцаарчилсан сургалтын төлөвлөгөө
Магистрын ганцаарчилсан сургалтын төлөвлөгөөМагистрын ганцаарчилсан сургалтын төлөвлөгөө
Магистрын ганцаарчилсан сургалтын төлөвлөгөө
naranbatn
 
Germany summer school-2010
Germany summer school-2010Germany summer school-2010
Germany summer school-2010
naranbatn
 
Uni sannio courses
Uni sannio coursesUni sannio courses
Uni sannio courses
naranbatn
 
Germany summer school-2010
Germany summer school-2010Germany summer school-2010
Germany summer school-2010
naranbatn
 
Germany international semester
Germany international semesterGermany international semester
Germany international semester
naranbatn
 
Uni sannio courses_faculty_economics
Uni sannio courses_faculty_economicsUni sannio courses_faculty_economics
Uni sannio courses_faculty_economics
naranbatn
 
Ull ph dproposals
Ull ph dproposalsUll ph dproposals
Ull ph dproposals
naranbatn
 
Ull ph dproposals
Ull ph dproposalsUll ph dproposals
Ull ph dproposals
naranbatn
 
Su pd ph-dproposals
Su pd ph-dproposalsSu pd ph-dproposals
Su pd ph-dproposals
naranbatn
 
Sannio ph dproposals_v4
Sannio ph dproposals_v4Sannio ph dproposals_v4
Sannio ph dproposals_v4
naranbatn
 

Más de naranbatn (20)

эрүүл мэндийн шинжлэх ухаан
эрүүл мэндийн шинжлэх ухаанэрүүл мэндийн шинжлэх ухаан
эрүүл мэндийн шинжлэх ухаан
 
Instructions to authors
Instructions to authorsInstructions to authors
Instructions to authors
 
Төгсөлтийн сургалтыг зохицуулах журам
Төгсөлтийн сургалтыг зохицуулах журамТөгсөлтийн сургалтыг зохицуулах журам
Төгсөлтийн сургалтыг зохицуулах журам
 
хичээлийн хуваарь 2011-2012 1-р улирал
хичээлийн хуваарь 2011-2012 1-р улиралхичээлийн хуваарь 2011-2012 1-р улирал
хичээлийн хуваарь 2011-2012 1-р улирал
 
д.амарсайхан захирал
д.амарсайхан захиралд.амарсайхан захирал
д.амарсайхан захирал
 
ц.лхагвасүрэн захирал
ц.лхагвасүрэн захиралц.лхагвасүрэн захирал
ц.лхагвасүрэн захирал
 
Self eval report english final for printing, 28.09.2011last for printing
Self eval report english final for printing, 28.09.2011last for printingSelf eval report english final for printing, 28.09.2011last for printing
Self eval report english final for printing, 28.09.2011last for printing
 
Бүрдүүлэх материал
Бүрдүүлэх материалБүрдүүлэх материал
Бүрдүүлэх материал
 
мэргэжлийн индекс
мэргэжлийн индексмэргэжлийн индекс
мэргэжлийн индекс
 
Health sciences university of mongolia
Health sciences university of mongoliaHealth sciences university of mongolia
Health sciences university of mongolia
 
Магистрын ганцаарчилсан сургалтын төлөвлөгөө
Магистрын ганцаарчилсан сургалтын төлөвлөгөөМагистрын ганцаарчилсан сургалтын төлөвлөгөө
Магистрын ганцаарчилсан сургалтын төлөвлөгөө
 
Germany summer school-2010
Germany summer school-2010Germany summer school-2010
Germany summer school-2010
 
Uni sannio courses
Uni sannio coursesUni sannio courses
Uni sannio courses
 
Germany summer school-2010
Germany summer school-2010Germany summer school-2010
Germany summer school-2010
 
Germany international semester
Germany international semesterGermany international semester
Germany international semester
 
Uni sannio courses_faculty_economics
Uni sannio courses_faculty_economicsUni sannio courses_faculty_economics
Uni sannio courses_faculty_economics
 
Ull ph dproposals
Ull ph dproposalsUll ph dproposals
Ull ph dproposals
 
Ull ph dproposals
Ull ph dproposalsUll ph dproposals
Ull ph dproposals
 
Su pd ph-dproposals
Su pd ph-dproposalsSu pd ph-dproposals
Su pd ph-dproposals
 
Sannio ph dproposals_v4
Sannio ph dproposals_v4Sannio ph dproposals_v4
Sannio ph dproposals_v4
 

Último

Último (20)

Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 

Research method ch08 statistical methods 2 anova

  • 1. 1 Research Methods in Health Chapter 8. Statistical Methods 2 ANOVA Young Moon Chae, Ph.D. Graduate School of Public Health Yonsei University, Korea ymchae@yuhs.ac
  • 2. 2 Table of Contents • One way ANOVA • Multiple comparison • Repeated measure ANOVA • ANCOVA
  • 4. 4 ANOVA When to use it • Analysis of variance (ANOVA) is the most commonly used technique for comparing the means of groups of measurement data. There are lots of different experimental designs that can be analyzed with different kinds of ANOVA • In a one-way ANOVA (also known as a single-classification ANOVA), there is one measurement variable and one nominal variable. Null hypothesis • The statistical null hypothesis is that the means of the measurement variable are the same for the different categories of data; the alternative hypothesis is that they are not all the same.
  • 5. 5 Rationale for ANOVA (1) • We have at least 3 means to test, e.g., H0: m1 = m2 = m3. • Could take them 2 at a time, but really want to test all 3 (or more) at once. • Instead of using a mean difference, we can use the variance of the group means about the grand mean over all groups. • Logic is just the same as for the t-test. Compare the observed variance among means (observed difference in means in the t-test) to what we would expect to get by chance.
  • 6. 6 ANOVA Assumptions • Data in each group are a random sample from some population. • Observations within groups are independent. • Samples are independent. • Underlying populations normally distributed. • Underlying populations have the same variance – This can formally tested with Bartlett’s test • What might happen (why would it be a problem) if the assumption of {normality, equality of error, independence of error} turned out to be false? -> Use non-parametric statistics or use data transformation
  • 7. 7 Multiple comparisons • When we carry out an ANOVA on k treatments, we test H0 : μ1 = · · · = μk versus Ha : H0 is false • Assume we reject the null hypothesis, i.e. we have some evidence that not all treatment means are equal. Then we could for example be interested in which ones are the same, and which ones differ. • For this, we might have to carry out some more hypothesis tests. • This procedure is referred to as multiple comparisons.
  • 8. 8 Types of multiple comparisons • There are two different types of multiple comparisons procedures: • Sometimes we already know in advance what questions we want to answer. Those comparisons are called planned (or a priori) comparisons. • Sometimes we do not know in advance what questions we want to answer, and the judgment about which group means will be studied the same depends on the ANOVA outcome. Those comparisons are called unplanned (or a posteriori) comparisons. -Planned comparisons: adjust for just those tests that are planned. -Unplanned comparisons: adjust for all possible comparisons.
  • 9. 9 Independence of planned comparison å = j jjcc 021 Comparison A1 A2 A3 A4 1 -1/3 1 -1/3 -1/3 2 -1/2 0 -1/2 1 3 1/2 1/2 -1/2 -1/2 0)1*3/1()2/1*3/1( )0*1()2/1*3/1(21 =-+-- ++--=åj jj cc 3/26/4)2/1*3/1()2/1*3/1( )2/1*1()2/1*3/1(31 ==--+-- ++-=åj jj cc One and two are orthogonal; one and three are not There are J-1 orthogonal comparisons. Use only what you need. .
  • 10. 10 Example 1 • We previously investigated whether the mean blood coagulation times for animals receiving different diets (A, B, C or D) were the same. • Imagine A is the standard diet, and we wish to compare each of diets B, C, D to diet A. → planned comparisons! • After inspecting the treatment means, we find that A and D look similar, and B and C look similar, but A and D are quite different from B and C. We might want to formally test that hypothesis. → unplanned comparisons!
  • 11. 11 Example 2 • A plant physiologist recorded the length of pea sections grown in tissue culture with auxin present. The purpose of the experiment was to investigate the effects of various sugars on growth. Four different treatments were used, plus one control (no sugar): no sugar; 2% glucose; 2% fructose; 1% glucose + 1% fructose; 2% sucrose. • The investigator wants to answer three specific questions: - Does the addition of sugars have an effect on the lengths of the pea sections? - Are there differences between the pure sugar treatments and the mixed sugar treatment? - Are there differences among the pure sugar treatments? → Planned comparisons!
  • 13. 13 Bonferroni Correction • Suppose we have 10 treatment groups, and so 45 pairs. • If we perform 45 t-tests at the significance level α = 0.05, we’d expect to reject 5% × 45 ≈ 2 of them, even if all of the means were the same. • Let α = Pr(reject at least one pairwise test | all μ’s the same) ≤ (no. tests) × Pr(reject test #1 | μ’s the same). • The Bonferroni correction: use α′ = α/(no. tests) as the significance level for each test. • For example, with 10 groups and so 45 pairwise tests, we’d use α′ = 0.05 / 45 ≈ 0.0011 for each test.
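The arithmetic on this slide can be reproduced as follows. This is a small sketch; the function name `bonferroni_alpha` is hypothetical and simply divides the overall significance level by the number of tests.

```python
from math import comb

def bonferroni_alpha(alpha, n_tests):
    """Per-test significance level under the Bonferroni correction."""
    return alpha / n_tests

n_groups = 10
n_pairs = comb(n_groups, 2)              # number of pairwise comparisons
alpha = 0.05

# Expected number of false rejections if every test is run at alpha
# and all means are in fact equal.
expected_false_rejections = alpha * n_pairs
alpha_per_test = bonferroni_alpha(alpha, n_pairs)

print(n_pairs)                               # 45
print(round(expected_false_rejections, 2))   # 2.25
print(round(alpha_per_test, 4))              # 0.0011
```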
  • 14. 14 Post Hoc Tests • Given a significant F, where are the mean differences? • Often do not have planned comparisons. • Usually compare pairs of means. • There are many methods of post hoc (after the fact) tests.
  • 15. 15 Scheffé • Can use for any contrast. Follows the same calculations, but uses different critical values. • Compute the test statistic $t = \hat{\psi} / \sqrt{\text{est. var}(\hat{\psi})}$. • Instead of comparing the test statistic to a critical value of t, use $S = \sqrt{(J-1) F_\alpha}$, where the F comes from the overall F test (J-1 and N-J df).
  • 16. 16 Scheffé (2) • ANOVA summary (data from earlier problem):
      Source           SS    df   MS   F
      Cells (A1-A4)    219   3    73   12.17
      Error            72    12   6
      Total            291   15
    • $\hat{\psi}_1 = (.5)(18) + (.5)(25) - (.5)(28) - (.5)(22) = -3.5$ • $\text{est. Var}(\hat{\psi}) = 6 \times \frac{.5^2 + .5^2 + (-.5)^2 + (-.5)^2}{4} = 3/2$ • $t = \hat{\psi} / \sqrt{\text{est. var}(\hat{\psi})} = -3.5 / 1.2247 = -2.86$ • $F_\alpha(.05; 3, 12) = 3.49$, so $S = \sqrt{(J-1) F_\alpha} = \sqrt{(4-1)(3.49)} = 3.24$ • The comparison is not significant because |-2.86| < 3.24.
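The Scheffé calculation on this slide can be checked step by step. In the sketch below, the cell means, MS error, and tabled F value are taken from the slide; the cell size of 4 is an assumption inferred from the 12 error degrees of freedom with 4 cells, and the variable names are illustrative only.

```python
from math import sqrt

means = [18, 25, 28, 22]           # cell means A1..A4, in the order used on the slide
weights = [0.5, 0.5, -0.5, -0.5]   # contrast weights
ms_error = 6                       # MS error from the ANOVA table
n_per_cell = 4                     # assumed: 16 observations over 4 cells
f_crit = 3.49                      # tabled F(.05; 3, 12), as given on the slide

# Contrast estimate and its estimated variance.
psi_hat = sum(c * m for c, m in zip(weights, means))
var_psi = ms_error * sum(c ** 2 / n_per_cell for c in weights)

t_stat = psi_hat / sqrt(var_psi)
s_crit = sqrt((len(means) - 1) * f_crit)   # Scheffé critical value
significant = abs(t_stat) > s_crit

print(psi_hat, var_psi)                        # -3.5 1.5
print(round(t_stat, 2), round(s_crit, 2))      # -2.86 3.24
print(significant)                             # False
```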
  • 17. 17 Paired comparisons • Newman-Keuls and Tukey HSD depend on q, the studentized range statistic. • Suppose we have J independent sample means and we find the largest and the smallest. Then $q = (\bar{y}_{max} - \bar{y}_{min}) / \sqrt{MS_{error}/n}$. • MS error comes from the ANOVA we did to get the J means. The n refers to sample size per cell. If the two cell sizes are unequal, use their harmonic mean, 2 n1 n2/(n1 + n2), in place of n. • The sampling distribution of q depends on k, the number of means covered by the range (max - min), and on v, the degrees of freedom for MS error.
  • 18. 18 Tukey HSD • HSD = honestly significant difference. • For HSD, use k = J, the number of groups in the study. Choose alpha, and find the df for error. Look up the value $q_\alpha$. Then find the value $HSD = q_\alpha \sqrt{MS_{error}/n}$. • Compare HSD to the absolute value of the difference between all pairs of means. Any difference larger than HSD is significant.
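  • 19. 19 HSD 2 • Group means:
      Group   1    2    3    4    5
      Mean    63   82   80   77   70
    • ANOVA summary:
      Source   SS       df   MS      F      p
      Groups   2942.4   4    735.6   4.13   <.05
      Error    9801.0   55   178.2
    • k = 5 groups; n = 12 per group; v = 55 df for error. The tabled value of q with alpha = .05 is 3.98, so $HSD = q_\alpha \sqrt{MS_{error}/n} = 3.98 \sqrt{178.2/12} = 15.34$. • Absolute differences between pairs of means, with groups ordered by mean (* = larger than HSD):
      Group   Mean   1    5    4    3     2
      1       63     0    7    14   17*   19*
      5       70          0    7    10    12
      4       77               0    3     5
      3       80                    0     2
      2       82                          0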
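The HSD comparison on this slide can be reproduced with the numbers given there. This is a sketch only: the tabled q value of 3.98 is copied from the slide rather than computed, and the variable names are illustrative.

```python
from itertools import combinations
from math import sqrt

group_means = {1: 63, 2: 82, 3: 80, 4: 77, 5: 70}   # from the slide
ms_error = 178.2       # MS error from the ANOVA table
n_per_group = 12
q_crit = 3.98          # tabled studentized range value, as given on the slide

hsd = q_crit * sqrt(ms_error / n_per_group)

# Any pair whose absolute mean difference exceeds HSD is declared significant.
significant_pairs = [
    (a, b)
    for a, b in combinations(sorted(group_means), 2)
    if abs(group_means[a] - group_means[b]) > hsd
]

print(round(hsd, 2))        # 15.34
print(significant_pairs)    # [(1, 2), (1, 3)]
```

Only the differences of 19 (groups 1 and 2) and 17 (groups 1 and 3) exceed the HSD, matching the two starred entries in the table.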
  • 20. 20 Comparing Post Hoc Tests • The Newman-Keuls found 3 significant differences in our example. The HSD found 2 differences. • If we had used the Bonferroni approach, we would have found an interval of 15.91 required for significance (and therefore the same two significant differences as HSD). Thus, power descends from the Newman-Keuls to the HSD to the Bonferroni. • The type I error rates go in just the opposite order: lowest for Bonferroni, then HSD, and finally Newman-Keuls. Do you want to be liberal or conservative in your choice of tests? Type I error vs. power.
  • 21. 21 Repeated Measures ANOVA When to Use Repeated Measures ANOVA • Repeated measures ANOVA is used when all members of a random sample are measured under a number of different conditions. As the sample is exposed to each condition in turn, the measurement of the dependent variable is repeated. Using a standard ANOVA in this case is not appropriate because the data violate the ANOVA assumption of independence. • This approach is used for several reasons. - Some research hypotheses require repeated measures. Longitudinal research, for example, measures each sample member at each of several ages. In this case, age would be a repeated factor. - When sample members are difficult to recruit, repeated measures designs are economical because each member is measured under all conditions.
  • 22. 22 Statistical Terminology Used in this Document • A sample member is called a subject. • When a dependent variable is measured repeatedly for all sample members across a set of conditions, this set of conditions is called a within-subjects factor. The conditions that constitute this type of factor are called trials. • When a dependent variable is measured on independent groups of sample members, where each group is exposed to a different condition, the set of conditions is called a between-subjects factor. The conditions that constitute this factor type are called groups. • When an analysis has both within-subjects factors and between-subjects factors, it is called a repeated measures ANOVA with between-subjects factors.
  • 23. 23 Example • Suppose that, as a health researcher, you want to examine the impact of dietary habit and exercise on pulse rate. To investigate these issues, you collect a sample of individuals and group them according to their dietary preferences: meat eaters and vegetarians. You then divide each diet category into three groups, randomly assigning each group to one of three types of exercise: aerobic stair climbing, racquetball, and weight training. So far, then, your design has two between-subjects (grouping) factors: dietary preference and exercise type. • Suppose that, in addition to these between-subjects factors, you want to include a single within-subjects factor in the analysis. Each subject's pulse rate will be measured at three levels of exertion: after warm-up exercises, after jogging, and after running. Thus, intensity (of exertion) is the within-subjects factor in this design. The order of these three measurements will be randomly assigned for each subject.
  • 24. 24 Research Questions : Within-Subjects Main Effect • Does intensity influence pulse rate? (Does mean pulse rate change across the trials for intensity?) This is the test for a within-subjects main effect of intensity. Between-Subjects Main Effects • Does dietary preference influence pulse rate? (Do vegetarians have different mean pulse rates than meat eaters?) This is the test for a between-subjects main effect of dietary preference. • Does exercise type influence pulse rate? (Are there differences in mean pulse rates between stair climbers, racquetball players, and weight trainers?) This is the test for a between-subjects main effect of exercise type. Between-Subjects Interaction Effect • Does the influence of exercise type on pulse rate depend on dietary preference? (Does the pattern of differences between mean pulse rates for exercise-type groups change for each dietary-preference group?) This is the test for a between-subjects interaction of exercise type by dietary preference.
  • 25. 25 Results • Diet: With a p value less than .0001, you have a statistically significant effect. You can therefore conclude that a statistically significant difference exists between vegetarians and meat eaters on their overall pulse rates. In other words, there is a main effect for diet. The cell means (not shown here) show that meat eaters experience higher pulse rates than vegetarians. • Exercise type: It is non-significant: F(2, 144) = .31, p=.7341. Thus, you can conclude that the type of exercise has no statistically significant effect on overall mean pulse rates. • The test of the DIET BY EXERTYPE interaction also shows a non-significant result (F(2, 144) = .52, p=.594). This suggests that dietary preferences and type of exercise do not combine to influence the overall average pulse rate. • When an interaction effect is significant, the pattern of cell means must be examined to determine the meaning not only of the interaction, but also the meaning of any main effects involved in the interaction.
  • 26. 26 Analyzing Continuous and Categorical IVs Simultaneously Analysis of Covariance
  • 27. 27 (cont.) When to use it • The purpose of ANCOVA is to compare two or more linear regression lines. It is a way of comparing the Y variable among groups while statistically controlling for variation in Y caused by variation in the X variable. Null hypotheses • Two null hypotheses are tested in an ANCOVA. The first is that the slopes of the regression lines are all the same. If this hypothesis is not rejected, the second null hypothesis is tested: that the Y-intercepts of the regression lines are all the same. • Although the most common use of ANCOVA is for comparing two regression lines, it is possible to compare three or more regressions. If their slopes are all the same, it is then possible to do planned or unplanned comparisons of Y-intercepts, similar to the planned or unplanned comparisons of means in an ANOVA
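The two null hypotheses described above can be written out explicitly. The notation below is an assumption introduced for illustration (the slides do not define symbols): $Y_{ij}$ and $X_{ij}$ are the response and covariate for observation $j$ in group $i$, $\alpha_i$ is the Y-intercept and $\beta_i$ the slope of group $i$'s regression line, and there are $k$ groups.

```latex
% Separate regression line for each group i = 1, ..., k
Y_{ij} = \alpha_i + \beta_i X_{ij} + \varepsilon_{ij}

% First null hypothesis: all slopes are equal
H_0^{(1)} : \beta_1 = \beta_2 = \cdots = \beta_k

% If H_0^{(1)} is not rejected, fit a common slope \beta
Y_{ij} = \alpha_i + \beta X_{ij} + \varepsilon_{ij}

% Second null hypothesis: all Y-intercepts are equal
H_0^{(2)} : \alpha_1 = \alpha_2 = \cdots = \alpha_k
```

Under the common-slope model, differences among the intercepts correspond to differences among groups in Y after adjusting for X.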
  • 28. 28 ANCOVA (GLM): Example • The General Linear Model (GLM) approach is used to perform an ANCOVA to determine whether MCAT scores differ significantly among medical students who had different types of undergraduate majors, after adjusting for year of matriculation.
  • 29. 29 ANCOVA (GLM): cont. • Dependent variable - nmtot1: MCAT total (most recent) • Fixed factor - bmaj2: Undergraduate major (1 = Biology/Chemistry; 2 = Other science/health; 3 = Other) • Covariate - matyr: Year of matriculation
  • 30. 30 ANCOVA (GLM) Output: Descriptive Statistics
  • 31. 31 ANCOVA (GLM) Output: Descriptive Statistics (continued)
  • 32. 32 ANCOVA (GLM) Output: Tests of Between-Subjects Effects
  • 33. 33 ANCOVA (GLM): Estimated Marginal Means: Undergraduate Major
  • 34. 34 ANCOVA (GLM): Post Hoc Tests: Undergraduate Major