1
Cox Regression II
Kristin Sainani Ph.D.
http://www.stanford.edu/~kcobb
Stanford University
Department of Health Research and Policy
2
Topics
 Stratification
 Age as time scale
 Residuals
 Repeated events
 Intention-to-treat analysis for RCTs
3
1. Stratification
Violations of PH assumption can be resolved by:
•Adding time*covariate interaction
•Adding another time-dependent version of the covariate
•Stratification
4
Stratification
•Different strata are allowed to have different baseline
hazard functions.
•Hazard functions do not need to be parallel across
strata.
•Essentially results in a “weighted” hazard ratio being
estimated: weighted over the different strata.
•Useful for “nuisance” confounders (where you do not care
to estimate the effect).
•Does not allow you to evaluate interaction with, or confounding
by, the stratification variable (will miss possible interactions).
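5
Example: stratify on gender
 Males: 1, 3, 4, 10+, 12, 18 (subjects 1-6)
 Females: 1, 4, 5, 9+ (subjects 7-10)
$$
\begin{aligned}
L_p(\beta) = \prod_{i=1}^{m} L_i
 ={}& \frac{h_1(1)}{h_1(1)+h_2(1)+h_3(1)+h_4(1)+h_5(1)+h_6(1)} && \text{(male)} \\
 &\times \frac{h_7(1)}{h_7(1)+h_8(1)+h_9(1)+h_{10}(1)} && \text{(female)} \\
 &\times \frac{h_2(3)}{h_2(3)+h_3(3)+h_4(3)+h_5(3)+h_6(3)} && \text{(male)} \\
 &\times \frac{h_3(4)}{h_3(4)+h_4(4)+h_5(4)+h_6(4)} && \text{(male)} \\
 &\times \frac{h_8(4)}{h_8(4)+h_9(4)+h_{10}(4)} && \text{(female)} \\
 &\times \frac{h_9(5)}{h_9(5)+h_{10}(5)} && \text{(female)} \\
 &\times \dots
\end{aligned}
$$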
6
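The PL
$$
L_p(\beta) = \prod_{i=1}^{m} L_i
 = \frac{\lambda_{0m}(1)\,e^{\beta x_1}}{\lambda_{0m}(1)e^{\beta x_1}+\lambda_{0m}(1)e^{\beta x_2}+\lambda_{0m}(1)e^{\beta x_3}+\lambda_{0m}(1)e^{\beta x_4}+\lambda_{0m}(1)e^{\beta x_5}+\lambda_{0m}(1)e^{\beta x_6}}
 \times \frac{\lambda_{0f}(1)\,e^{\beta x_7}}{\lambda_{0f}(1)e^{\beta x_7}+\lambda_{0f}(1)e^{\beta x_8}+\lambda_{0f}(1)e^{\beta x_9}+\lambda_{0f}(1)e^{\beta x_{10}}}
 \times \dots
$$
$$
\therefore\; L_p(\beta) = \prod_{i=1}^{m} L_i
 = \frac{e^{\beta x_1}}{e^{\beta x_1}+e^{\beta x_2}+e^{\beta x_3}+e^{\beta x_4}+e^{\beta x_5}+e^{\beta x_6}}
 \times \frac{e^{\beta x_7}}{e^{\beta x_7}+e^{\beta x_8}+e^{\beta x_9}+e^{\beta x_{10}}}
 \times \dots
$$
Here λ_0m and λ_0f are the baseline hazards of the male (♂) and female (♀) strata; each cancels within its own term.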
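The stratified partial likelihood can be sketched in a few lines of Python. This is only an illustration, not the SAS implementation: the survival times and strata come from the gender example, while the covariate values `X`, the function names, and the choice of β are hypothetical. Because the baseline hazard cancels within each stratum, only `exp(beta * x)` appears in the code.

```python
import math

# (subject id, stratum, time, event indicator); "+" on the slide marks censoring
DATA = [
    (1, "M", 1, 1), (2, "M", 3, 1), (3, "M", 4, 1),
    (4, "M", 10, 0), (5, "M", 12, 1), (6, "M", 18, 1),
    (7, "F", 1, 1), (8, "F", 4, 1), (9, "F", 5, 1), (10, "F", 9, 0),
]

# Hypothetical binary covariate (not given on the slide), keyed by subject id
X = {1: 1, 2: 0, 3: 1, 4: 0, 5: 1, 6: 0, 7: 1, 8: 0, 9: 1, 10: 0}


def stratified_risk_sets(data):
    """Return (time, failing subject, stratum, risk set) for every event.

    The risk set contains only subjects from the failing subject's own
    stratum who are still under observation at that time.
    """
    terms = []
    for sid, stratum, time, event in data:
        if not event:
            continue  # censored subjects contribute no term of their own
        risk = [s for s, g, t, _ in data if g == stratum and t >= time]
        terms.append((time, sid, stratum, risk))
    return sorted(terms)


def partial_likelihood(data, x, beta):
    """Product over events of exp(beta*x_i) / sum over the risk set."""
    value = 1.0
    for _, sid, _, risk in stratified_risk_sets(data):
        denom = sum(math.exp(beta * x[j]) for j in risk)
        value *= math.exp(beta * x[sid]) / denom
    return value


for time, sid, stratum, risk in stratified_risk_sets(DATA):
    print(f"t={time:>2} subject {sid:>2} ({stratum}) risk set {risk}")
print(partial_likelihood(DATA, X, beta=0.0))
```

At β = 0 every term reduces to 1 over the size of its risk set, which gives a quick sanity check on the risk sets themselves.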
7
2. Using age as the time-scale
in Cox Regression
 Age is a common confounder in Cox
Regression, since age is strongly related to
death and disease.
 You may control for age by adding baseline
age as a covariate to the Cox model.
 A better strategy for large-scale longitudinal
surveys, such as NHANES, is to use age as
your time-scale (rather than time-in-study).
 You may additionally stratify on birth cohort to
control for cohort effects.
8
Age as time-scale
 The risk set becomes everyone who was at
risk at a certain age rather than at a certain
event time.
 The risk set contains everyone who was still
event-free at the age of the person who had
the event.
 Requires enough people at risk at all ages
(such as in a large-scale, longitudinal survey).
9
The likelihood with age as time
Event times: 3, 5, 7+, 12, 13+ (years-in-study)
Baseline ages: 28, 25, 40, 29, 30 (years)
Age at event or censoring: 31, 30, 47+, 41, 43+
$$
L_p(\beta) = \prod_{i=1}^{m} L_i
 = \frac{h_2(30)}{h_1(30)+h_2(30)+h_4(30)+h_5(30)}
 \times \frac{h_1(31)}{h_1(31)+h_4(31)+h_5(31)}
 \times \frac{h_4(41)}{h_3(41)+h_4(41)+h_5(41)}
$$
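The risk sets in this likelihood can be reproduced with a short Python sketch. It uses only the five subjects from the slide; the function name is hypothetical, and the rule "entry age <= event age <= exit age" is an assumption chosen to match the slide (subject 5, who enters at age 30, is counted at risk at age 30). Some software uses a strict inequality at entry instead.

```python
# (subject, baseline age, years in study, event indicator) from the slide
SUBJECTS = [
    (1, 28, 3, 1),
    (2, 25, 5, 1),
    (3, 40, 7, 0),   # censored (7+)
    (4, 29, 12, 1),
    (5, 30, 13, 0),  # censored (13+)
]


def age_scale_risk_sets(subjects):
    """Return (event age, failing subject, risk set) with age as the time scale.

    A subject is at risk at age a if they have already entered the study
    and have not yet had the event or been censored:
    entry age <= a <= exit age (assumed convention).
    """
    spans = [(sid, age0, age0 + years, event)
             for sid, age0, years, event in subjects]
    result = []
    for sid, _, exit_age, event in spans:
        if event:
            risk = [s for s, entry, leave, _ in spans
                    if entry <= exit_age <= leave]
            result.append((exit_age, sid, risk))
    return sorted(result)


for age, sid, risk in age_scale_risk_sets(SUBJECTS):
    print(f"age {age}: subject {sid} fails, risk set {risk}")
```

Note that subject 3 is absent from the first two risk sets even though that subject was in the study: at ages 30 and 31 they had not yet been enrolled.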
10
3. Residuals
 Residuals are used to investigate the
lack of fit of a model to a given subject.
 For Cox regression, there’s no easy
analog to the usual “observed minus
predicted” residual of linear regression
11
Martingale residual
 ci (1 if event, 0 if censored) minus the estimated
cumulative hazard to ti (as a function of the fitted model) for
individual i:
ci - H(ti, Xi, β)
 E.g., for a subject who was censored at 2 months, and whose predicted
cumulative hazard to 2 months was 0.20:
 Martingale = 0 - .20 = -.20
 E.g., for a subject who had an event at 13 months, and whose predicted
cumulative hazard to 13 months was 0.50:
 Martingale = 1 - .50 = +.50
 Gives excess failures.
 Martingale residuals are not symmetrically distributed,
even when the fitted model is correct, so transform to
deviance residuals...
12
Deviance Residuals
 The deviance residual is a normalized
transform of the martingale residual.
These residuals are much more
symmetrically distributed about zero.
 Observations with large deviance
residuals are poorly predicted by the
model.
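As a rough illustration, the two martingale examples can be carried through to deviance residuals in Python. The slides do not state the transform; the sketch below assumes the commonly used definition d = sign(M) * sqrt(-2 * [M + c * log(c - M)]), where M is the martingale residual and c the event indicator. The function names are hypothetical.

```python
import math


def martingale_residual(event, cum_hazard):
    """Event indicator (1/0) minus the fitted cumulative hazard."""
    return event - cum_hazard


def deviance_residual(event, cum_hazard):
    """Deviance residual under the assumed standard transform.

    For an event, cum_hazard must be > 0 so the logarithm is defined.
    """
    m = martingale_residual(event, cum_hazard)
    # event - m equals the cumulative hazard; the log term vanishes if censored
    log_term = math.log(event - m) if event else 0.0
    inner = max(0.0, -2.0 * (m + log_term))  # guard against rounding below 0
    return math.copysign(math.sqrt(inner), m)


# Censored at 2 months with cumulative hazard 0.20
print(martingale_residual(0, 0.20), deviance_residual(0, 0.20))
# Event at 13 months with cumulative hazard 0.50
print(martingale_residual(1, 0.50), deviance_residual(1, 0.50))
```

Under this definition the censored subject gets a deviance residual of about -0.63 and the subject with the event about +0.62, so the two are closer to symmetric than the martingale values of -0.20 and +0.50.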
13
Deviance Residuals
 Behave like residuals from ordinary linear
regression
 Should be symmetrically distributed around 0
and have standard deviation of 1.0.
 Negative for observations with longer than
expected observed survival times.
 Plot deviance residuals against covariates to
look for unusual patterns.
14
Deviance Residuals
 In SAS, option on the output statement:
Output out=outdata resdev=Varname
**Cannot get diagnostics in SAS if there is a time-dependent covariate in the model
15
Example: uis data
Out of 628 observations, a few in the range of 3 SD are not unexpected.
The pattern looks fairly symmetric around 0.
16
Example: uis data
What do you
think this
cluster
represents?
17
Example: censored only
18
Example: had event only
19
Schoenfeld residuals
 Schoenfeld (1982) proposed the first set of
residuals for use with Cox regression
packages
 Schoenfeld D. Residuals for the proportional
hazards regression model. Biometrika, 1982,
69(1):239-241.
 Instead of a single residual for each
individual, there is a separate residual for
each individual for each covariate
 Note: Schoenfeld residuals are not defined for
censored individuals.
20
Schoenfeld residuals
 The Schoenfeld residual is defined as the covariate
value for the individual that failed minus its expected
value. (Yields residuals for each individual who failed,
for each covariate).
 Expected value of the covariate at time ti = a weighted
average of the covariate, weighted by the likelihood of
failure for each individual in the risk set at ti.
$$
\text{residual}_{ik} = x_{ik} - \sum_{j \in R(t_i)} x_{jk}\, p_j
$$
where x_ik is the covariate value of the person who failed (e.g., 56 years of age), R(t_i) is the risk set at time t_i, and p_j is the probability of the event now for the j-th person in the risk set.
The person who died was 56; based on the fitted
model, how likely is it that the person who died
was 56 rather than older?
21
Example
 5 people left in our risk set at event
time = 7 months:
 Female 55-year-old smoker
 Male 45-year-old non-smoker
 Female 67-year-old smoker
 Male 58-year-old smoker
 Male 70-year-old non-smoker
The 55-year-old female smoker is the one who
has the event…
22
Example
Based on our model, we can calculate a
predicted probability of death by time 7 for
each person (call it “p-hat”):
 Female 55-year-old smoker: p-hat=.10
 Male 45-year-old non-smoker: p-hat=.05
 Female 67-year-old smoker: p-hat=.30
 Male 58-year-old smoker: p-hat=.20
 Male 70-year-old non-smoker: p-hat=.30
Thus, the expected value for the AGE of the person who
failed is:
55(.10) + 45(.05) + 67(.30) + 58(.20) + 70(.30) = 60.45 ≈ 60
And the Schoenfeld residual is: 55 - 60 = -5
23
Example
Based on our model, we can calculate a
predicted probability of death by time 7 for
each person (call it “p-hat”):
 Female 55-year-old smoker: p-hat=.10
 Male 45-year-old non-smoker: p-hat=.05
 Female 67-year-old smoker: p-hat=.30
 Male 58-year-old smoker: p-hat=.20
 Male 70-year-old non-smoker: p-hat=.30
The expected value for the GENDER (coded 1 = male, 0 = female) of the person who
failed is:
0(.10) + 1(.05) + 0(.30) + 1(.20) + 1(.30) = .55
And the Schoenfeld residual is: 0 - .55 = -.55
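The two worked examples can be checked with a small Python sketch. The ages, gender coding (1 = male, 0 = female), and p-hat values are taken from the slides exactly as listed (note that they sum to 0.95 rather than 1); the function and variable names are hypothetical.

```python
# Risk set at event time = 7 months: (age, male indicator, p-hat)
RISK_SET = [
    (55, 0, 0.10),  # female smoker, the person who has the event
    (45, 1, 0.05),
    (67, 0, 0.30),
    (58, 1, 0.20),
    (70, 1, 0.30),
]


def schoenfeld_residual(observed, values, weights):
    """Covariate value of the person who failed minus its weighted average.

    Returns (expected value, residual) for one covariate at one event time.
    """
    expected = sum(v * w for v, w in zip(values, weights))
    return expected, observed - expected


ages = [row[0] for row in RISK_SET]
males = [row[1] for row in RISK_SET]
p_hat = [row[2] for row in RISK_SET]

age_expected, age_resid = schoenfeld_residual(55, ages, p_hat)
sex_expected, sex_resid = schoenfeld_residual(0, males, p_hat)
print(round(age_expected, 2), round(age_resid, 2))  # 60.45 -5.45
print(round(sex_expected, 2), round(sex_resid, 2))  # 0.55 -0.55
```

The unrounded age residual is -5.45; the slide rounds the expected age to 60 and reports -5.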
24
Schoenfeld residuals
 Since the Schoenfeld residuals are, in
principle, independent of time, a plot that
shows a non-random pattern against time is
evidence of violation of the PH assumption.
 Plot Schoenfeld residuals against time to evaluate
PH assumption
 Regress Schoenfeld residuals against time to test
for independence between residuals and time.
25
Example: no pattern with time
26
Example: violation of PH
27
Schoenfeld residuals
In SAS:
option on the output statement:
Output out=outdata ressch= Covariate1
Covariate2 Covariate3
28
Summary of the many ways to
evaluate PH assumption…
1. Examine log(-log(S(t))) plots
PH assumption is supported by parallel lines and refuted by lines that cross
or nearly cross
Must use categorical predictors or categories of a continuous predictor
2. Include interaction with time in the model
PH assumption is supported by non-significant interaction coefficient and
refuted by significant interaction coefficient
Retaining the interaction term in the model corrects for the violation of PH
Don’t complicate your model in this way unless it’s absolutely necessary!
3. Plot Schoenfeld residuals
PH assumption is supported by a random pattern with time and refuted by a
non-random pattern
4. Regress Schoenfeld residuals against time to test for
independence between residuals and time.
PH assumption is supported by a non-significant relationship between
residuals and time, and refuted by a significant relationship
29
4. Repeated events
 Death (presumably) can only happen
once, but many outcomes could happen
twice…
 Fractures
 Heart attacks
 Pregnancy
Etc…
30
Repeated events: Strategy 1
 Run a second Cox
regression (among those who had a
first event) starting with first event time
as the origin
 Repeat for third, fourth, fifth events,
etc.
 Problem: progressively smaller
sample sizes.
31
Repeated events: Strategy 2
 Treat each interval as a distinct
observation, such that someone who
had 3 events, for example, gives 3
observations to the dataset
 Major problem: dependence among observations from the
same individual
32
Repeated events: Strategy 3
 Stratify by individual (“fixed effects partial
likelihood”)
 In PROC PHREG: strata id;
 Problems:
 does not work well with RCT data
 requires that most individuals have at least 2
events
 can only estimate coefficients for those covariates
that vary across successive spells for each
individual; this excludes constant personal
characteristics such as age, education, gender,
ethnicity, genotype
33
5. Considerations when
analyzing data from an RCT…
34
Intention-to-Treat Analysis
Intention-to-treat analysis: compare
outcomes according to the groups to
which subjects were initially assigned,
regardless of which intervention they
actually received.
Evaluates treatment effectiveness
rather than treatment efficacy
35
Why intention to treat?
 Non-intention-to-treat analyses lose the
benefits of randomization, as the groups may
no longer be balanced with regard to factors
that influence the outcome.
 Intention-to-treat analysis simulates “real life,”
where patients often don’t adhere perfectly to
treatment or may discontinue treatment
altogether.
36
Drop-ins and Drop-outs:
example, WHI
Both women on
placebo and women
on active treatment
discontinued study
medications.
Women on placebo “dropped
in” to treatment because
their regular doctors put
them on hormones (dogma=
“hormones are good”).
Women on treatment
“dropped in” to treatment
because their doctors took
them off study drugs and
put them on hormones to
ensure they were on
hormones and not placebo.
Women’s Health Initiative Writing
Group. JAMA. 2002;288:321-333.
37
Effect of Intention to treat on
the statistical analysis
 Intention-to-treat analyses tend to
underestimate treatment effects;
increased variability “waters down”
results.
38
Example
Take the following hypothetical RCT:
Treated subjects have a 25% chance of dying during the 2-year
study vs. placebo subjects have a 50% chance of dying.
TRUE RR= 25%/50% = .50 (treated have 50% less chance of
dying)
You do a 2-yr RCT of 100 treated and 100 placebo subjects.
If nobody switched, you would see about 25 deaths in the treated
group and about 50 deaths in the placebo group (give or take a
few due to random chance).
∴Observed RR≅ .50
39
Example, continued
BUT, if early in the study 25 treated subjects switch
to placebo and 25 placebo subjects switch to
treatment, you would see about
25*.25 + 75*.50 = 43-44 deaths in the placebo group
And about
25*.50 + 75*.25 = 31 deaths in the treated group
Observed RR = 31/44 ≅ .70
Diluted effect!
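The dilution arithmetic can be written out as a short Python sketch. The group sizes, risks, and number of switchers are those of the hypothetical RCT above; the function name is an illustrative assumption, and the calculation treats everyone who switches as taking on the other arm's risk for the whole study.

```python
def expected_deaths(n, n_switched, own_risk, other_risk):
    """Expected deaths in an arm analyzed as randomized (intention to treat).

    n_switched subjects actually experience the other arm's risk.
    """
    return (n - n_switched) * own_risk + n_switched * other_risk


RISK_TREATED, RISK_PLACEBO = 0.25, 0.50

# Nobody switches: the analysis recovers the true risk ratio
rr_true = (expected_deaths(100, 0, RISK_TREATED, RISK_PLACEBO)
           / expected_deaths(100, 0, RISK_PLACEBO, RISK_TREATED))

# 25 subjects per arm switch early in the study
deaths_treated = expected_deaths(100, 25, RISK_TREATED, RISK_PLACEBO)  # 31.25
deaths_placebo = expected_deaths(100, 25, RISK_PLACEBO, RISK_TREATED)  # 43.75
rr_itt = deaths_treated / deaths_placebo

print(rr_true, deaths_treated, deaths_placebo, round(rr_itt, 2))
```

Without rounding the death counts, the observed risk ratio is 31.25/43.75, or about .71, which agrees with the slide's rounded figure of roughly .70.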
40
References
Paul Allison. Survival Analysis Using SAS. SAS Institute
Inc., Cary, NC: 2003.
Editor's notes

Note: see also the recent diabetes study in a Swedish population, where they reduced heart disease by 50% among type II diabetics. It was a very tough intervention: multiple drugs, attempts to quit smoking, diet, and exercise. Not everyone participated fully; for example, smoking cessation was an utter failure. If everyone had participated more fully in the regimen, the effect might have been even stronger.