Survival.pptx

t-test and ANOVA
Haramaya University
College of Health and Medical Science
Analysis of Quantitative Variable

Comparison of groups on a numeric variable
The t-test
One-way ANOVA

T-test
The t-test is used in one of the following three ways:
a. One-sample t-test
• It compares a sample mean with a hypothesized population mean to see whether the two differ significantly.
• Assumptions that should be met before using this method:
  ◦ The dependent variable is normally distributed in the population.
  ◦ The observations are independent (the score of one participant does not depend on the score of another).

T-test cont…
• Hypotheses: $H_0: \mu = \mu_0$ vs. $H_A: \mu \neq \mu_0$, where $\mu_0$ is the hypothesized mean value.
• The test statistic is $t_{calc} = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$.
• We compare the calculated test statistic ($t_{calc}$) with the tabulated value ($t_{tab}$) at $n - 1$ degrees of freedom.

T-test cont…
Example data: the distance covered by marathon runners until physiological stress develops, and whether or not they used a drug.

No.   Distance (miles)   Drug use
1     14.5               no
2     13.4               no
3     14.8               yes
4     19.5               yes
5     14.5               no
6     18.2               yes
7     16.3               no
8     14.8               no
9     20.3               yes
10    18.4               yes
11    16.9               yes
12    12.6               no
13    13.4               no
14    16.3               yes
15    17.1               yes
16    11.8               no
17    13.3               yes
18    14.5               no

Mean distance: 15.59 miles
Standard deviation: 2.43 miles

T-test cont..
It is believed that the mean distance covered before feeling physiological stress is 15 miles.
Hypotheses: $H_0: \mu = 15$ versus $H_A: \mu \neq 15$
Level of significance: $\alpha = 0.05$
$\bar{x} = 15.59$, $s = 2.43$, $n = 18$
$t_{calc} = (\bar{x} - \mu_0)/(s/\sqrt{n}) = (15.59 - 15)/(2.43/\sqrt{18}) = 1.03$, with P-value = 0.318
At 17 degrees of freedom and $\alpha = 0.05$, $t_{tab} = 2.110$.
Since $t_{calc} = 1.03 < 2.110 = t_{tab}$ (equivalently, $\alpha = 0.05 < 0.318$ = P-value), we fail to reject $H_0$.
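
The one-sample calculation above can be reproduced with a short sketch. It uses only the Python standard library, takes the 18 distances from the data table, and takes the critical value 2.110 from the slide rather than computing it; the variable names are illustrative and not part of the original material.

```python
import math
import statistics

# Distances (miles) of the 18 runners, in subject order, from the data table.
distances = [14.5, 13.4, 14.8, 19.5, 14.5, 18.2, 16.3, 14.8, 20.3,
             18.4, 16.9, 12.6, 13.4, 16.3, 17.1, 11.8, 13.3, 14.5]

mu_0 = 15.0    # hypothesized population mean
t_tab = 2.110  # two-sided critical value at 17 df, alpha = 0.05 (taken from the slide)

n = len(distances)
x_bar = statistics.mean(distances)
s = statistics.stdev(distances)  # sample SD, n - 1 in the denominator
t_calc = (x_bar - mu_0) / (s / math.sqrt(n))

# Decision rule: reject H0 only if |t_calc| exceeds the tabulated value.
reject_h0 = abs(t_calc) > t_tab

print(f"mean = {x_bar:.2f}, s = {s:.2f}, t_calc = {t_calc:.2f}")
print("reject H0" if reject_h0 else "fail to reject H0")
```

The P-value is not computed here because the standard library has no t distribution; a statistics package would be needed for that step.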

T-test cont..
b. Paired t-test
• Each observation in one sample has exactly one mate in the other sample, so the two samples are dependent.
• For example, the paired measurements can be before and after (e.g. before and after an intervention), repeated measurements (e.g. using digital and analog apparatus), or data from two dependent sources (e.g. the mother and the father of a respondent).
Hypotheses: $H_0: \mu_d = 0$ vs. $H_A: \mu_d \neq 0$

T-test cont..
Subject BP before BP after Difference (di)
1 130 110 -20
2 125 130 +5
3 140 120 -20
4 150 130 -20
5 120 110 -10
6 130 130 0
7 120 115 -5
8 135 130 -5
9 140 130 -10
10 130 120 -10
d (Average of d) -9.5
Sd (Standard deviation of d) 8.64
Example : The blood pressure (BP) of 10 mothers were measured before a
nd after taking a new drug.

T-test cont..
Hypotheses: $H_0: \mu_d = 0$ versus $H_A: \mu_d \neq 0$
Level of significance: $\alpha = 0.05$
$\bar{d} = -9.5$, $s_d = 8.64$, $n = 10$
$t_{calc} = (\bar{d} - \mu_d)/(s_d/\sqrt{n}) = -9.5/(8.64/\sqrt{10}) = -3.48$, with P-value = 0.0075
At $n - 1 = 9$ df and $\alpha = 0.05$, $t_{tab} = 2.26$.
Since $|t_{calc}| = 3.48 > 2.26 = t_{tab}$ (equivalently, P-value = 0.0075 < 0.05 = $\alpha$), we reject $H_0$.
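
As a check on the paired example, the sketch below recomputes the differences and the test statistic from the blood pressure table, using only the Python standard library. The critical value 2.26 is the slide's tabulated value; the variable names are illustrative. Differences are taken as after minus before, which gives a negative statistic, so the decision uses its absolute value.

```python
import math
import statistics

# Blood pressure of the 10 mothers, in subject order, from the table.
bp_before = [130, 125, 140, 150, 120, 130, 120, 135, 140, 130]
bp_after = [110, 130, 120, 130, 110, 130, 115, 130, 130, 120]

# Paired differences d_i = after - before, one per subject.
d = [after - before for before, after in zip(bp_before, bp_after)]

n = len(d)
d_bar = statistics.mean(d)
s_d = statistics.stdev(d)                # sample SD of the differences
t_calc = d_bar / (s_d / math.sqrt(n))    # mu_d = 0 under H0

t_tab = 2.26  # two-sided critical value at 9 df, alpha = 0.05 (taken from the slide)
reject_h0 = abs(t_calc) > t_tab

print(f"d_bar = {d_bar:.1f}, s_d = {s_d:.2f}, t_calc = {t_calc:.2f}")
print("reject H0" if reject_h0 else "fail to reject H0")
```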

T-test cont..
c. Two independent samples t-test
• Used to compare two unrelated (independent) groups.
• Assumptions include:
  ◦ The variances of the dependent variable in the two populations are equal.
  ◦ The dependent variable is normally distributed within each population.
  ◦ The observations are independent (the scores of one participant are not systematically related to the scores of the others).
• Hypotheses: $H_0: \mu_t = \mu_c$ vs. $H_A: \mu_t \neq \mu_c$, where $\mu_t$ and $\mu_c$ are the population means of the treatment and control (placebo) groups respectively.

T-test cont..
• The test statistic is $t_{calc} = \dfrac{\bar{x}_t - \bar{x}_c}{\sqrt{S^2\left(\frac{1}{n_t} + \frac{1}{n_c}\right)}}$,
• where $S^2 = \dfrac{(n_t - 1)S_t^2 + (n_c - 1)S_c^2}{n_t + n_c - 2}$
• $S^2$ is the pooled (combined) variance of both groups.
• We compare $t_{calc}$ with the tabulated value at $n_t + n_c - 2$ degrees of freedom and decide accordingly.

Example
Do marathon runners grouped by drug intake status differ in the average distance they cover before feeling any physiological stress?
Hypotheses: $H_0: \mu_t = \mu_c$ vs. $H_A: \mu_t \neq \mu_c$, where $\mu_t$ and $\mu_c$ are the means for drug users and non-users respectively.
Level of significance: $\alpha = 0.05$
$\bar{x}_c = 13.98$, $s_c = 1.33$, $\bar{x}_t = 17.20$, $s_t = 2.21$, $n_c = n_t = 9$
$t_{calc} = (\bar{x}_c - \bar{x}_t)/\sqrt{S^2(1/n_c + 1/n_t)} = -3.741$, with P-value = 0.002, where $S^2$ is the pooled (combined) variance of both groups.
At 16 df and $\alpha = 0.05$, $t_{tab} = 2.12$.
Since $|t_{calc}| = 3.741 > 2.12 = t_{tab}$ (equivalently, P-value = 0.002 < 0.05 = $\alpha$), we reject $H_0$.
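
The two-sample example can be sketched in the same way. The code below splits the 18 distances by drug use, pools the two sample variances and forms the t statistic; it uses only the Python standard library, and the critical value 2.12 is taken from the slide. All names are illustrative.

```python
import math
import statistics

# Distances (miles) and drug use of the 18 runners, in subject order.
distances = [14.5, 13.4, 14.8, 19.5, 14.5, 18.2, 16.3, 14.8, 20.3,
             18.4, 16.9, 12.6, 13.4, 16.3, 17.1, 11.8, 13.3, 14.5]
drug_use = ["no", "no", "yes", "yes", "no", "yes", "no", "no", "yes",
            "yes", "yes", "no", "no", "yes", "yes", "no", "yes", "no"]

users = [x for x, u in zip(distances, drug_use) if u == "yes"]      # treatment (t)
non_users = [x for x, u in zip(distances, drug_use) if u == "no"]   # control (c)

n_t, n_c = len(users), len(non_users)
mean_t, mean_c = statistics.mean(users), statistics.mean(non_users)
var_t, var_c = statistics.variance(users), statistics.variance(non_users)

# Pooled variance, weighting each sample variance by its degrees of freedom.
df = n_t + n_c - 2
pooled_var = ((n_t - 1) * var_t + (n_c - 1) * var_c) / df

# Same sign convention as the worked example: non-users minus users.
t_calc = (mean_c - mean_t) / math.sqrt(pooled_var * (1 / n_c + 1 / n_t))

t_tab = 2.12  # two-sided critical value at 16 df, alpha = 0.05 (taken from the slide)
reject_h0 = abs(t_calc) > t_tab

print(f"mean_c = {mean_c:.2f}, mean_t = {mean_t:.2f}, df = {df}, t_calc = {t_calc:.3f}")
print("reject H0" if reject_h0 else "fail to reject H0")
```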

T-test cont…
• In the two independent samples t-test we have:
  ◦ one continuous dependent variable (interval/ratio data), and
  ◦ one nominal or ordinal independent variable with only two categories.
• What if the independent variable has more than two categories?

One-way Analysis of Variance (One-way ANOVA)
• Are the birth weights of children in different geographical regions the same?
• Are the responses of patients to different medications and placebo different?
• Do people in different age groups have different proportions of body fat?
• Do people of different ethnicities have the same BMI?

One-way Analysis of Variance cont…
• All the above research questions share one characteristic: each involves two variables, one categorical and one quantitative.
• Main question: Are the averages of the quantitative variable the same across the groups (categories)?
• The name "one-way ANOVA" comes from the fact that there is only one categorical independent variable, with two or more categories (groups).

One-way ANOVA cont…
• Also called the Completely Randomized Design.
• Experimental units (subjects) are assigned randomly to treatments/groups. Here the subjects are assumed to be homogeneous.

Analysis of variance cont…
• One-way ANOVA is a method for testing the hypothesis that:
  ◦ there is no difference between two or more population means (usually at least three); or
  ◦ there is no difference between a number of treatments.
• More formally, we can state the hypotheses as:
H0: There is no difference among the means of the treatment effects
HA: There is a difference between at least two treatment effects
or
$H_0: \mu_1 = \mu_2 = \mu_3 = \dots = \mu_a$ (if there are 'a' groups)
$H_A$: at least one group mean is different

Why Not Just Use t-tests?
• Since the t-test considers two groups at a time, it becomes tedious when many groups are present.
• Conducting multiple t-tests can also lead to severe inflation of the Type I error rate (false positives) and is not recommended.
• ANOVA, in contrast, tests for differences among several means in a single test, without inflating the Type I error rate.
• ANOVA uses data from all groups at once to estimate standard errors, which can increase the power of the analysis.
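
To make the inflation concrete, here is a rough sketch of how the chance of at least one false positive grows with the number of pairwise t-tests. It assumes the tests are independent, which is only an approximation because pairwise comparisons share data; the function name `familywise_error_rate` is hypothetical.

```python
from math import comb


def familywise_error_rate(num_groups, alpha=0.05):
    """Return (k, rate): the number of pairwise tests among `num_groups`
    groups and the probability of at least one Type I error across them,
    assuming the k tests are independent (an approximation)."""
    k = comb(num_groups, 2)           # number of possible pairs
    rate = 1 - (1 - alpha) ** k       # P(at least one false positive)
    return k, rate


for groups in (3, 4, 5, 6):
    k, rate = familywise_error_rate(groups)
    print(f"{groups} groups: {k} pairwise tests, familywise error rate about {rate:.3f}")
```

Under this assumption, even three groups push the overall error rate well above the nominal 5%.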

Assumptions of One-way ANOVA
• The data are normally distributed, or the samples have come from normally distributed populations, and the observations are independent.
• The variance is the same in each group to be compared (equal variance).
• Moderate departures from normality may be safely ignored, but the effect of unequal standard deviations may be serious.

Analysis of variance cont…
• We test the equality of means among groups by using the variance.
• Comparing the variation within groups with the variation between groups helps us to compare the means.
• If the two are about equal, the observed differences among the group means are likely due to chance and not real differences.
Note that:
Total variability = Variability between groups + Variability within groups

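Analysis of variance
• Basic model: data are deviations from the global mean, $\mu$: $X_{ij} = \mu + \varepsilon_{ij}$
  ◦ The sum of the squared vertical deviations is the total sum of squares, SSt.
• One-way model: data are deviations from the treatment means, with treatment effects $A_i$: $X_{ij} = \mu + A_i + \varepsilon_{ij}$
  ◦ The sum of the squared vertical deviations is SSe.
• Note that $\sum A_i = \sum \varepsilon_{ij} = 0$
[Figure: two panels showing the observations of groups G-1 and G-2 plotted around the global mean μ (basic model) and around the treatment effects A1 and A2 (one-way model)]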

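Decomposing the total variability
• Total SS: $SST = \sum_{j=1}^{a}\sum_{i=1}^{n} (x_{ij} - \bar{x}_{..})^2 = \sum_{j}\sum_{i} x_{ij}^2 - \dfrac{\left(\sum_{j}\sum_{i} x_{ij}\right)^2}{na}$
• Within SS: $SSW = \sum_{j=1}^{a}\sum_{i=1}^{n} (x_{ij} - \bar{x}_{.j})^2 = \sum_{j}\sum_{i} x_{ij}^2 - \sum_{j}\dfrac{\left(\sum_{i} x_{ij}\right)^2}{n}$
• Between SS: $SSB = \sum_{j=1}^{a}\sum_{i=1}^{n} (\bar{x}_{.j} - \bar{x}_{..})^2 = \sum_{j}\dfrac{\left(\sum_{i} x_{ij}\right)^2}{n} - \dfrac{\left(\sum_{j}\sum_{i} x_{ij}\right)^2}{na}$
Here $\bar{x}_{.j}$ is the mean of group $j$ and $\bar{x}_{..}$ is the grand mean.
This assumes that each of the 'a' groups has equal size, 'n'.
$SST = SSW + SSB$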
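
The identity SST = SSW + SSB can be checked numerically. The sketch below applies the deviation (definitional) form of the three sums of squares to a small balanced data set. The numbers are hypothetical, chosen only for illustration, and do not come from the slides.

```python
import statistics

# Hypothetical balanced data: a = 3 groups with n = 4 observations each.
groups = [
    [4.0, 5.0, 6.0, 5.0],
    [7.0, 8.0, 6.0, 7.0],
    [9.0, 10.0, 11.0, 10.0],
]

all_values = [x for g in groups for x in g]
grand_mean = statistics.mean(all_values)
group_means = [statistics.mean(g) for g in groups]

# Total SS: squared deviations of every observation from the grand mean.
sst = sum((x - grand_mean) ** 2 for x in all_values)

# Within SS: squared deviations of every observation from its own group mean.
ssw = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

# Between SS: squared deviations of group means from the grand mean,
# counted once per observation in the group.
ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))

print(f"SST = {sst:.4f}, SSW = {ssw:.4f}, SSB = {ssb:.4f}")
print("SST equals SSW + SSB:", abs(sst - (ssw + ssb)) < 1e-9)
```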

Data of one way ANOVA

Participant   G-1    G-2    G-3    …    G-a
1             X11    X12    X13    …    X1a
2             X21    X22    X23    …    X2a
3             X31    X32    X33    …    X3a
⋮             ⋮      ⋮      ⋮           ⋮
n             Xn1    Xn2    Xn3    …    Xna
Totals        T.1    T.2    T.3    …    T.a

Computational formula
$T = \sum_{j}\sum_{i} x_{ij}^2$
Correction factor: $CF = \left(\sum_{j}\sum_{i} x_{ij}\right)^2/(na) = T_{..}^2/(na)$ (with unequal group sizes, $na$ is replaced by the total sample size $N$)
$A = \sum_{j}\left(\sum_{i} x_{ij}\right)^2/n = \sum_{j} T_{.j}^2/n$ if the group (cell) sizes are equal, or
$A = \sum_{j}\left(\sum_{i} x_{ij}\right)^2/n_j = \sum_{j} T_{.j}^2/n_j$ if the group sizes are unequal,
where $x_{ij}$ is the $i$th observation in the $j$th group of the table, $i = 1, 2, 3, \dots, n_j$, $j = 1, 2, 3, \dots, a$, and $\sum_{j} n_j = N$.

Sum of squares and ANOVA Table
• If there are real differences among the group means, the between-groups variation will be larger than the within-groups variation.

Source of variation   df        SS              MS              F
Between groups        a - 1     SSB = A - CF    SSB/(a - 1)     MSB/MSW
Within groups         na - a    SSW = T - A     SSW/(na - a)
Total                 na - 1    SST = T - CF

Example on one-way ANOVA
The following table shows the red cell folate levels (μg/l) in three groups of cardiac bypass patients who were given three different levels of nitrous oxide ventilation. (Level of nitrous oxide: group I > group II > group III.)

         Group I (n=8)   Group II (n=9)   Group III (n=5)
         243             206              241
         251             210              258
         275             226              270
         291             249              293
         347             255              328
         354             273
         380             285
         392             295
                         309
Total    2533            2308             1390
Mean     316.6           256.4            278.0
SD       58.7            37.1             33.8

Example cont…
A box plot gives a first impression of the data.
[Figure: box plot of red cell folate level by group]

Example cont…
$H_0: \mu_1 = \mu_2 = \mu_3$
$H_A$: Differences exist between at least two of the means

Source of variation   df    SS      Mean square   F      P
Between groups        2     15516   7758          3.71   0.044
Within groups         19    39716   2090
Total                 21    55232

Since the P-value is less than 0.05, the null hypothesis is rejected.
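
The ANOVA table above can be rebuilt from the raw folate values with the computational formulas (T, CF and A). The sketch below is one way to do it with plain Python. Because the three groups have unequal sizes, it uses the unequal-size form of A and divides the correction factor by the total sample size N. The P-value is not computed, since that needs an F distribution from a statistics package.

```python
# Red cell folate levels (ug/l) for the three groups, from the example table.
groups = {
    "I": [243, 251, 275, 291, 347, 354, 380, 392],
    "II": [206, 210, 226, 249, 255, 273, 285, 295, 309],
    "III": [241, 258, 270, 293, 328],
}

a = len(groups)                                # number of groups
N = sum(len(g) for g in groups.values())       # total sample size

T = sum(x ** 2 for g in groups.values() for x in g)       # sum of squared observations
CF = sum(sum(g) for g in groups.values()) ** 2 / N        # correction factor
A = sum(sum(g) ** 2 / len(g) for g in groups.values())    # sum of T.j^2 / n_j

ssb = A - CF    # between-groups sum of squares
ssw = T - A     # within-groups sum of squares
sst = T - CF    # total sum of squares

df_between, df_within = a - 1, N - a
msb = ssb / df_between
msw = ssw / df_within
f_stat = msb / msw

print(f"Between groups: df = {df_between}, SS = {ssb:.0f}, MS = {msb:.0f}, F = {f_stat:.2f}")
print(f"Within groups:  df = {df_within}, SS = {ssw:.0f}, MS = {msw:.0f}")
print(f"Total:          df = {N - 1}, SS = {sst:.0f}")
```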

Pair-wise comparisons of group means: post hoc tests or multiple comparisons
• The ANOVA test tells us only whether there is a statistically significant difference among the group means.
• It doesn't tell us which particular groups are significantly different.
• To identify them, we use either a priori (pre-planned) or post hoc tests.

Pair-wise comparisons of group means (post hoc tests) cont…
• When you look at the data it may seem worth comparing all possible pairs.
• In this case, a post hoc test is employed, such as:
  ◦ Scheffé
  ◦ Bonferroni (modified t-test)
  ◦ Tukey
  ◦ Least Significant Difference (LSD), etc.

Bonferroni method or modified t-test (steps)
I. Find $t_{calc}$ for the pairs of groups of interest (to be compared).
II. The modified t-test is based on the pooled estimate of variance from all the groups (which is the residual variance in the ANOVA table), not just from the pair being considered.
III. If we perform k pairwise comparisons, then we should multiply the P-value obtained from each test by k; that is, we calculate $P' = kP$, with the restriction that $P'$ cannot exceed 1,
where $k = a(a - 1)/2$ is the number of possible pairwise comparisons among 'a' groups.

Bonferroni method or modified t-test
• Returning to the red cell folate data given above, the residual standard deviation is $\sqrt{2090} = 45.72$.
(a) Comparing groups I and II
$t = (316.6 - 256.4)/(45.72 \times \sqrt{1/8 + 1/9}) = 60.2/22.22 = 2.71$ on 19 degrees of freedom.
• The corresponding P-value = 0.014 and the corrected P-value is $P' = 0.014 \times 3 = 0.042$.
Groups I and II are different.

(b) Comparing groups I and III
$t = (316.6 - 278.0)/(45.72 \times \sqrt{1/8 + 1/5}) = 38.6/26.06 = 1.48$
• The corresponding P-value = 0.1625 and the corrected P-value is $P' = 0.1625 \times 3 = 0.4875$.
Groups I and III are not different.

(c) Comparing groups II and III
$t = (278.0 - 256.4)/(45.72 \times \sqrt{1/5 + 1/9}) = 21.6/25.5 = 0.85$
• The corresponding P-value = 0.425 and the corrected P-value is $P' = 1.00$ (since $0.425 \times 3 = 1.275$ exceeds 1).
Groups II and III are not different.
Therefore, the main explanation for the difference between the groups that was identified in the ANOVA is the difference between groups I and II.
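
The three comparisons above follow one pattern, which the sketch below makes explicit. It recomputes each modified t statistic from the group means, group sizes and residual standard deviation, then applies the correction P' = min(1, kP). The uncorrected P-values are copied from the slides rather than computed, because the Python standard library has no t distribution; all names are illustrative.

```python
import math
from itertools import combinations

# Summary statistics of the red cell folate example.
means = {"I": 316.6, "II": 256.4, "III": 278.0}
sizes = {"I": 8, "II": 9, "III": 5}
residual_sd = 45.72  # square root of the within-groups mean square, 19 df

# Uncorrected two-sided P-values as reported on the slides (not computed here).
reported_p = {("I", "II"): 0.014, ("I", "III"): 0.1625, ("II", "III"): 0.425}

pairs = list(combinations(means, 2))
k = len(pairs)  # number of pairwise comparisons

results = {}
for g1, g2 in pairs:
    # Standard error uses the pooled (residual) SD from all groups.
    se = residual_sd * math.sqrt(1 / sizes[g1] + 1 / sizes[g2])
    t = (means[g1] - means[g2]) / se
    p_corrected = min(1.0, k * reported_p[(g1, g2)])  # Bonferroni correction
    results[(g1, g2)] = (t, p_corrected)
    verdict = "different" if p_corrected < 0.05 else "not different"
    print(f"{g1} vs {g2}: |t| = {abs(t):.2f}, P' = {p_corrected:.4f} -> {verdict}")
```

Only the sign of t depends on which group is listed first, so the decision is unaffected by the ordering of each pair.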

Which post hoc method shall I use? cont…
• The Bonferroni approach uses a series of t-tests (that is, the LSD technique) but corrects the significance level for multiple testing by dividing it by the number of tests being performed.
• Since this test corrects for the number of comparisons being performed, it is generally used when the number of groups to be compared is small.

Which post hoc method shall I use? cont..
• Tukey's Honestly Significant Difference (Tukey's HSD) test also corrects for multiple comparisons, but it considers the power of the study to detect differences between groups rather than just the number of tests being carried out.
• That is, it takes into account sample size as well as the number of tests being performed.
• This makes it preferable when a large number of groups are being compared, since it reduces the chances of a Type I error occurring.

One-way ANOVA's limitations
• This technique is applicable only when there is a single treatment (factor).
• Note that this single treatment can have 3, 4, …, many levels.
• Thus a nutrition trial on children's weight gain with 4 different feeding styles could be analyzed this way, but a trial of BOTH nutrition and mothers' health status could not.
