Confounding, politics, frustration and knavish tricks

2008 Bradford Hill Lecture. An explanation of some problems with the propensity score and why its supposed superiority to ANCOVA is doubtful

  1. Bradford Hill lecture 2008: Confounding, politics, frustration and knavish tricks. Stephen Senn
  2. "If when the tide is falling you take out water with a twopenny pail, you and the moon together can do a great deal." Bradford Hill, A., and Hill, I. D. (1990), Principles of Medical Statistics (12th edition), p. 247.
  3. The Central Problem of Epidemiology
     • This is generally recognised to be confounding
     • Where experiments cannot be conducted we must make do with observational studies
     • There is also the risk that, due to hidden confounders, we will conclude causation when all we have is association
     • Hill was a (the) key figure in promoting randomised controlled trials (RCTs)
     • But he also recognised that RCTs were not enough and was a pioneer of observational studies
       – Case control, as in Doll and Hill (1950)
       – Cohort, as in Doll and Hill (1954)
  4. Outline
     • Some statistics of the propensity score
     • An explanation of the propensity score
     • Comparison to ANCOVA
     • Some criticisms
     • Conclusions
     Acknowledgement: this is based on joint work with Erika Graf and Angelika Caputo.
     Senn, S., Graf, E., and Caputo, A. (2007), "Stratification for the Propensity Score Compared with Linear Regression Techniques to Assess the Effect of Treatment or Exposure," Statistics in Medicine, 26, 5529-5544.
  5. A Question for You to Consider
     • Consider these two experiments
       – A completely randomised trial: patients allocated with 50% probability to A or B
       – Randomised matched pairs: member of any pair randomised with 50% probability to A or B
     • In analysing, would you ignore the matching in the second case?
  6. Propensity score: background
     • Due to Rosenbaum and Rubin, Biometrika 1983
     • Has been cited over 1000 times since first published
     • Citation rate has grown rapidly since 1995 and is now more than 200 per year
  7. [Figure: Annual citations of Rosenbaum and Rubin, 1985 to 2005, showing data and fitted model; x-axis: Year, y-axis: Citations.] This model predicts more than 300 citations in 2008.
  8. [Figure: Cumulative citations of Rosenbaum and Rubin, 1985 to 2005; x-axis: Year, y-axis: Cumulative.]
  9. [Chart of subject categories]
     Subject category                               Share
     Economics                                      19.45%
     Statistics & Probability                       17.24%
     Public, Environmental & Occupational Health    12.75%
     Cardiac & Cardiovascular Systems               10.69%
     Social Sciences, Mathematical Methods          8.99%
     Mathematical & Computational Biology           6.63%
     Surgery                                        6.48%
     Health Care Sciences & Services                6.34%
     Respiratory System                             5.75%
     Medicine, General & Internal                   5.67%
  10. Propensity Score Explanation
      • We consider two 'treatments' or exposures a subject might have received
      • The assignment indicator is X
        – X = 0, if subject receives exposure 0
        – X = 1, if subject receives exposure 1
      • There is a vector of covariates W
  11. Counterfactual responses
      • For every subject we have two responses
        – r0
        – r1
      • One of these will be observed
      • The other is unobserved
        – Counterfactual
  12. Propensity score: definition
      $e(W) = P(X = 1 \mid W)$
      This is a form of balancing score $b(W)$. A balancing score is defined as follows. If $r_0$ is the response given by a subject that is unexposed (indexed by 0) and $r_1$ is the response when the same subject is exposed (indexed by 1), and
      $(r_0, r_1) \perp X \mid W$ and $(r_0, r_1) \perp X \mid b(W)$,
      then $b(W)$ is a balancing score. Rosenbaum and Rubin show that the finest such score is $W$ itself and the coarsest is the propensity score.
  13. Propensity score uses
      • Calculate the propensity score for each subject
      • Stratify by the propensity score
        – In practice fifths are used
      • The resulting estimator is unbiased
        – The possible confounding influence of W has been eliminated
  14. Propensity Score: An Example
      Disposition of subjects in a study

                        Exposure
      Sex      Age      A      B      Total
      Male     Young    240    80     320
      Male     Old      60     20     80
      Female   Young    80     240    320
      Female   Old      20     60     80
      Total             400    400    800
  15. Sex is predictive of exposure but age is not

               Exposure
      Sex      A      B      Total
      Male     300    100    400
      Female   100    300    400
      Total    400    400    800

               Exposure
      Age      A      B      Total
      Young    320    320    640
      Old      80     80     160
      Total    400    400    800
  16. Class           Relative frequency (A)   'Probability' of disposition (to A)
      Young males     240/320                  3/4
      Old males       60/80                    3/4
      Young females   80/320                   1/4
      Old females     20/80                    1/4

      The philosophy of the propensity score is to stratify by probability of allocation. In this case this is equivalent to stratifying by sex.
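The table above can be reproduced directly from the counts on the disposition slide. The following is a minimal sketch, not the lecture's own code; the names `counts`, `propensity_to_a`, `scores` and `strata` are illustrative.

```python
from fractions import Fraction

# Counts from the disposition table: (sex, age) -> (number on A, number on B).
counts = {
    ("male", "young"): (240, 80),
    ("male", "old"): (60, 20),
    ("female", "young"): (80, 240),
    ("female", "old"): (20, 60),
}


def propensity_to_a(n_a, n_b):
    """Relative frequency of exposure A within one covariate class."""
    return Fraction(n_a, n_a + n_b)


scores = {cls: propensity_to_a(*n) for cls, n in counts.items()}

# Classes that share a score form one propensity stratum.
strata = {}
for cls, score in scores.items():
    strata.setdefault(score, []).append(cls)

for score, classes in strata.items():
    print(score, classes)
```

Both male classes fall in one stratum and both female classes in the other, which is why stratifying by the score is the same as stratifying by sex in this example.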
  17. Response: age is predictive of outcome but sex is not

               Treatment
      Sex      A      B      Difference
      Male     96     136    40
      Female   96     136    40

               Treatment
      Age      A      B      Difference
      Young    100    140    40
      Old      80     120    40
  18. The Difference to Conventional Approaches
      • Conventional approaches correct for covariates if they are predictive of outcome
        – Analysis of covariance
        – Stratification
      • The propensity score corrects if covariates are predictive of assignment (allocation)
      • In this example correcting either for sex (propensity score) or age (ANCOVA) will produce an "unbiased" estimate
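The claim in the last bullet can be checked by arithmetic. The sketch below is illustrative only: the per-cell means in `cell_mean` are an assumption (response depends on age and exposure but not on sex), chosen because they reproduce both response tables from the disposition counts; all function names are hypothetical.

```python
# Counts from the disposition table: (sex, age, exposure) -> n.
n = {
    ("male", "young", "A"): 240, ("male", "young", "B"): 80,
    ("male", "old", "A"): 60, ("male", "old", "B"): 20,
    ("female", "young", "A"): 80, ("female", "young", "B"): 240,
    ("female", "old", "A"): 20, ("female", "old", "B"): 60,
}

# Assumed cell means: they depend on age and exposure only, not on sex.
cell_mean = {("young", "A"): 100.0, ("old", "A"): 80.0,
             ("young", "B"): 140.0, ("old", "B"): 120.0}


def mean_response(exposure, stratum=None, by=None):
    """Mean response for one exposure, optionally within one level of `by`."""
    total = weighted = 0.0
    for (sex, age, exp), count in n.items():
        level = {"sex": sex, "age": age}.get(by)
        if exp != exposure or (by is not None and level != stratum):
            continue
        total += count
        weighted += count * cell_mean[(age, exp)]
    return weighted / total


def stratified_difference(by, levels):
    """Size-weighted average of the within-stratum B minus A differences."""
    diffs, sizes = [], []
    for level in levels:
        diffs.append(mean_response("B", level, by) - mean_response("A", level, by))
        sizes.append(sum(c for (sex, age, _), c in n.items()
                         if {"sex": sex, "age": age}[by] == level))
    return sum(d * s for d, s in zip(diffs, sizes)) / sum(sizes)


crude = mean_response("B") - mean_response("A")
by_sex = stratified_difference("sex", ["male", "female"])  # propensity score route
by_age = stratified_difference("age", ["young", "old"])    # ANCOVA-style route
print(crude, by_sex, by_age)
```

Under these assumed cell means all three estimates agree, because no single covariate in this example predicts both assignment and outcome.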
  19. In terms of linear regression
      To define some general notation: let $\beta_{UV}$ be the marginal regression of $U$ on $V$, and let $\beta_{UV.T}$ be the conditional regression of $U$ on $V$ given $T$.
      Now consider a specific implementation where $Y$ is outcome, $X$ is treatment and $W$ is covariate:
      $\beta_{YX} = \beta_{YX.W} + \beta_{YW.X}\,\beta_{WX}$
      $\therefore \beta_{YX} = \beta_{YX.W}$ if $\beta_{YW.X} = 0$ (1) or $\beta_{WX} = 0$ (2).
      (1) is the analysis of covariance condition for not including something in the model and (2) is the propensity score condition.
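The decomposition of the marginal slope holds exactly for least-squares fits with an intercept, which a short numerical check makes concrete. This is a sketch on simulated data; the coefficients used to generate `w` and `y` and the helper `ols` are assumptions for illustration, not values from the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Hypothetical data: W is associated with X, and Y depends on both.
x = rng.integers(0, 2, size=n).astype(float)
w = 0.8 * x + rng.normal(size=n)
y = 1.0 * x + 2.0 * w + rng.normal(size=n)


def ols(response, *predictors):
    """Least-squares slopes (intercept fitted but not returned)."""
    design = np.column_stack([np.ones(len(response)), *predictors])
    coef, *_ = np.linalg.lstsq(design, response, rcond=None)
    return coef[1:]


(b_yx,) = ols(y, x)            # marginal regression of Y on X
b_yx_w, b_yw_x = ols(y, x, w)  # conditional slopes of Y on X and on W
(b_wx,) = ols(w, x)            # marginal regression of W on X

print(b_yx, b_yx_w + b_yw_x * b_wx)
```

The two printed numbers agree to rounding error, so the marginal and conditional slopes for X coincide only when one of the two factors in the product term is zero.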
  20. Some myths of the propensity score
      • Collinearity of predictors makes traditional regression adjustments unusable
      • Quintile stratification on the propensity score eliminates bias more effectively than ANCOVA
      • The propensity score can be more efficient than ANCOVA
      • The coarsening property of the propensity score benefits efficiency
  21. Collinearity of Predictors
      Consider a simple example in which the following predictor pattern is repeated a number of times.

      Covariate/Confounder   Exposure
      W1     W2              X
      0      0               0
      0      0               1
      1      1               0
      1      1               1

      Clearly the effects of W1 and W2 are not identifiable, but the effect of X is, and any decent statistical package should be able to estimate it even if W1 and W2 are in the model. In the following example it is supposed that
      $Y = W_1 + W_2 + X + \varepsilon, \quad \varepsilon \sim N(0, 1)$
      and that we have the same basic pattern of predictors for 1000 observations.
  22. Analysis with GenStat 1
      Case where W1 and W2 are completely collinear
      Message: term W2 cannot be included in the model because it is aliased with terms already in the model.
      (W2) = (W1)

      Regression analysis: estimates of parameters
      Parameter   estimate   s.e.     t(997)   t pr.
      Constant    -0.0266    0.0542   -0.49    0.624
      W1          2.0067     0.0626   32.05    <.001
      X           1.0377     0.0626   16.57    <.001
  23. Analysis with GenStat 2
      Case where W1 and W2 are strongly collinear (a small bit of noise added to W2)

      Regression analysis: estimates of parameters
      Parameter   estimate   s.e.     t(996)   t pr.
      Constant    -0.0270    0.0542   -0.50    0.619
      W1          -0.82      3.16     -0.26    0.795
      W2e         2.83       3.16     0.89     0.372
      X           1.0372     0.0626   16.56    <.001

      Message: the variance of some parameter estimates is seriously inflated, due to near collinearity or aliasing between the following parameters, listed with their variance inflation factors.
      W1    2553.00
      W2e   2553.00
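The two GenStat analyses can be imitated with a short simulation. This is a sketch in NumPy rather than the GenStat runs from the lecture, so the numbers will differ; the seed and the noise standard deviation of 0.01 added to W2 are assumptions, and `fit` is a hypothetical helper.

```python
import numpy as np

rng = np.random.default_rng(2008)

# The four-row predictor pattern (W1, W2, X), repeated to give 1000 observations.
pattern = np.array([[0, 0, 0], [0, 0, 1], [1, 1, 0], [1, 1, 1]], dtype=float)
w1, w2, x = np.tile(pattern, (250, 1)).T
y = w1 + w2 + x + rng.normal(size=x.size)


def fit(*predictors):
    """Least-squares coefficients: intercept first, then one per predictor."""
    design = np.column_stack([np.ones(y.size), *predictors])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


# Completely collinear case: lstsq returns the minimum-norm solution, so W1 and
# W2 share their joint effect, while the coefficient of X is uniquely determined.
aliased = fit(w1, w2, x)

# Strongly collinear case: a little noise is added to W2 (scale is an assumption).
w2e = w2 + rng.normal(scale=0.01, size=x.size)
near = fit(w1, w2e, x)

print("aliased: W1 + W2 =", aliased[1] + aliased[2], " X =", aliased[3])
print("near:    W1 =", near[1], " W2e =", near[2], " X =", near[3])
```

In both fits the estimate for X stays close to its true value of 1, even though the separate coefficients for W1 and W2e are poorly determined in the second fit.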
  24. Better at eliminating bias?
      • Some papers have purported to show this
      • Claims have been demonstrated using simulation
      • But the simulations have been unfair
        – For example, using models of different implicit complexity
      • It is trivial to produce examples where quintile stratification does not work
        – Suppose a baseline covariate differs by one standard deviation between exposures and outcome is a linear function of this
      • ANCOVA works perfectly, propensity score is biased
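The one-standard-deviation example in the bullets above can be simulated in a few lines. This is a sketch under stated assumptions, not a result from the lecture: the covariate is normal with equal variance in both groups, so the true propensity score is monotone in the covariate and quintiles of the covariate coincide with quintiles of the score; the sample size, seed and true effect `tau` are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
n_per_group = 20000
tau = 1.0  # assumed true exposure effect

# The covariate W differs by one standard deviation between exposures,
# and the outcome is linear in W.
x = np.repeat([0.0, 1.0], n_per_group)
w = rng.normal(loc=x, scale=1.0)
y = tau * x + w + rng.normal(size=x.size)

# ANCOVA: regress Y on X and W.
design = np.column_stack([np.ones_like(x), x, w])
ancova = np.linalg.lstsq(design, y, rcond=None)[0][1]

# Quintile stratification: fifths of W stand in for fifths of the propensity score.
cuts = np.quantile(w, [0.2, 0.4, 0.6, 0.8])
stratum = np.searchsorted(cuts, w)
diffs, sizes = [], []
for s in range(5):
    in_s = stratum == s
    diffs.append(y[in_s & (x == 1)].mean() - y[in_s & (x == 0)].mean())
    sizes.append(in_s.sum())
stratified = np.average(diffs, weights=sizes)

print("ANCOVA:", ancova, " quintile stratification:", stratified)
```

The stratified estimate overshoots because exposed subjects still have higher covariate values than unexposed subjects within each fifth, most noticeably in the two extreme fifths.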
  25. More efficient than ANCOVA?
      • The propensity score stratifies by probability of assignment
      • But ANCOVA stratifies by predictors of outcome, not assignment
      • By definition, residual variance is less for ANCOVA
      • By definition, loss of orthogonality is greater for the propensity score
      • Consequence: variance of estimators is higher for the propensity score
      • Propensity score incoherent?
  26. Furthermore
      • The coarseness property of the propensity score is completely irrelevant
      • There is no gain in efficiency through this property
      • The loss in orthogonality is equivalent to fitting all covariates and their interactions with each other
      • You might as well just use (multivariate) W
  27. A Regression Reminder
      Let $P = [X \;\; W]$. Then
      $\operatorname{var}(\hat{\beta}) = \sigma^2 (P'P)^{-1} = \sigma^2 \begin{pmatrix} a_{00} & \cdots & \cdots \\ \vdots & a_{XX} & \cdots \\ \vdots & \vdots & \ddots \end{pmatrix}$
      The propensity score philosophy chooses the members of $W$ in such a way that $a_{XX}$ is maximised. Analysis of covariance chooses the members so that $\sigma^2$ is minimised.
  28. Another Example

                   Young           Old             Total all ages
                   X = 0   X = 1   X = 0   X = 1   X = 0   X = 1   Total
      Male         3       7       80      30      83      37      120
      Female       8       42      9       21      17      63      80
      Total both   11      49      89      51      100     100     200
      Total        60              140             200
  29. Another Example: the same table as on the previous slide, annotated with e(W) = 0.7.
  30. Propensity score stratification

                                                    Exposure assignment
      Stratum or strata           Propensity score   X = 0   X = 1   Total
      Old males                   e(W) = 0.27        80      30      110
      Young males + Old females   e(W) = 0.70        12      28      40
      Young females               e(W) = 0.84        8       42      50
      Total                                          100     100     200
  31. For our Second Example

      Factors in model in addition to exposure   Variance multiplier, a_XX
      None                                       0.0200
      Age                                        0.0242
      Sex                                        0.0257
      Sex + age                                  0.0267
      Sex + age + sex × age                      0.0271

      The last of these is the same as for the propensity score.
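These variance multipliers follow from the cell counts of the second example. The sketch below rebuilds the design matrix one row per subject and reads off the diagonal element of the inverse cross-product matrix for X. It assumes an intercept in every model and 0/1 coding for sex and age; the names `cells`, `a_xx` and `multipliers` are illustrative.

```python
import numpy as np

# Cell counts from the second example: (female, old, x, count).
cells = [
    (0, 0, 0, 3), (0, 0, 1, 7),     # young males
    (0, 1, 0, 80), (0, 1, 1, 30),   # old males
    (1, 0, 0, 8), (1, 0, 1, 42),    # young females
    (1, 1, 0, 9), (1, 1, 1, 21),    # old females
]
rows = np.repeat(np.array([c[:3] for c in cells], dtype=float),
                 [c[3] for c in cells], axis=0)
female, old, x = rows.T
one = np.ones_like(x)


def a_xx(*covariates):
    """Diagonal element for X of (P'P)^-1, with P = [1, X, covariates]."""
    p = np.column_stack([one, x, *covariates])
    return np.linalg.inv(p.T @ p)[1, 1]


multipliers = {
    "none": a_xx(),
    "age": a_xx(old),
    "sex": a_xx(female),
    "sex + age": a_xx(female, old),
    "sex + age + sex x age": a_xx(female, old, female * old),
}

# Propensity strata: old males (0.27), young females (0.84), and the pooled
# middle stratum of young males and old females (0.70) as the reference level.
old_males = (1 - female) * old
young_females = female * (1 - old)
multipliers["propensity strata"] = a_xx(old_males, young_females)

for model, value in multipliers.items():
    print(f"{model:24s}{value:.4f}")
```

The propensity strata give the same multiplier as the saturated model because pooling young males and old females, who share a score of 0.7, loses no information about X.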
  32. Conditional Distributions and the Propensity Score
      • The appropriateness of the propensity score is always illustrated in terms of the expectation of the treatment estimate
        – Unbiasedness in linear framework
      • Its suitability when looked at in terms of the full conditional distribution is less obvious, as will now be demonstrated
  33. Suppose that we are interested in the conditional distribution of an outcome variable $Y$ given a putative causal variable $X$ and a further covariate $W$. We wish to investigate the circumstances under which $W$ can be ignored. That is to say, we wish to know the conditions under which
      $f(Y \mid W \cap X) = f(Y \mid X)$.
      Now,
      $f(W \cap X \cap Y) = f(X)\, f(W \mid X)\, f(Y \mid W \cap X)$  (1)
      and
      $f(W \cap X \cap Y) = f(X)\, f(Y \mid X)\, f(W \mid X \cap Y)$.  (2)
      Equating the right-hand sides of (1) and (2),
      $f(X)\, f(W \mid X)\, f(Y \mid W \cap X) = f(X)\, f(Y \mid X)\, f(W \mid X \cap Y)$
      $\therefore f(Y \mid W \cap X) = f(Y \mid X)\, \dfrac{f(W \mid X \cap Y)}{f(W \mid X)}$.  (3)
      Now, (3) is not in general equivalent to $f(Y \mid W \cap X) = f(Y \mid X)$ unless
      $\dfrac{f(W \mid X \cap Y)}{f(W \mid X)} = 1$,  (4)
      which implies $f(W \mid X \cap Y) = f(W \mid X)$ and hence $W \perp Y \mid X$, that is, $Y \perp W \mid X$.
  34. Conclusion
      • The claims that are made for the propensity score are true in terms of conditional expectation (at least for the linear model)
      • However, they are not true in terms of the full conditional model
      • For $W$ to be ignorable in that sense requires $Y \perp W \mid X$
      • This is the ANCOVA condition
  35. Implications for Modelling
      • It is not true that ignoring a covariate that is predictive of outcome but not assignment is acceptable
      • In the linear case estimators are unbiased but their variances are "incorrect"
      • More generally, however, conditional and unconditional estimators are different
        – Logistic regression, survival analysis
  36. [Diagram: variables Y, Z and X1 to X6.] What should join Z in the model?
  37. [Diagram: variables Y, Z and X1 to X6.] With inappropriate terms removed
  38. [Diagram: variables Y, Z and X1 to X6.] Propensity score adjustment
  39. [Diagram: variables Y, Z and X1 to X6.] ANCOVA adjustment
  40. Non-linear example
      Simulation as before but binary response on Y > 1.5. With balanced covariates.

      Parameter   estimate   s.e.     t(*)     t pr.   antilog of estimate
      Constant    -2.442     0.185    -13.18   <.001   0.08696
      W1          4.98       8.51     0.59     0.558   146.2
      W2e         -1.73      8.51     -0.20    0.839   0.1768
      X           1.689      0.192    8.78     <.001   5.413

      Parameter   estimate   s.e.     t(*)     t pr.   antilog of estimate
      Constant    -0.4642    0.0918   -5.06    <.001   0.6287
      X           0.962      0.130    7.40     <.001   2.617
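The gap between the two estimates for X reflects the fact that odds ratios are not collapsible, even when covariates are perfectly balanced. The sketch below computes population odds ratios exactly instead of fitting a logistic regression, so it will not reproduce the fitted values above. It assumes the latent model Y = W1 + W2 + X + ε with standard normal ε, with W1 = W2 taking the values 0 and 1 equally often at each level of X, and it ignores the small noise added to W2; all function names are illustrative.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def p_event(w_total, x, threshold=1.5):
    """P(W1 + W2 + X + eps > threshold) for eps ~ N(0, 1)."""
    return 1.0 - normal_cdf(threshold - w_total - x)


def odds(p):
    return p / (1.0 - p)


# Conditional odds ratios for X within each covariate level (W1 + W2 = 0 or 2).
conditional_or = {w: odds(p_event(w, 1)) / odds(p_event(w, 0)) for w in (0, 2)}

# Marginal odds ratio: average the event probabilities over the balanced
# covariate distribution first, then form the odds.
marginal_p = {x: 0.5 * (p_event(0, x) + p_event(2, x)) for x in (0, 1)}
marginal_or = odds(marginal_p[1]) / odds(marginal_p[0])

print("conditional:", conditional_or, " marginal:", marginal_or)
```

Under these assumptions the conditional odds ratio is the same at both covariate levels and is much larger than the marginal one, the same pattern as in the two fitted models.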
  41. Not convinced? An Example
      • An open trial of the effect of alcohol consumption on the ability to memorize word lists
      • Volunteers to be drawn at random and divided into two groups
      • One lot to be given a glass of wine, the other a glass of water
  42. Two Possible Approaches
      Experiment 1
      • A subject has name drawn at random
      • If chosen for control group, given blue ball
      • If chosen for treatment group, given red ball
      • "All you who have a blue ball please come to receive your glass of water, red ball to receive your glass of wine"
      Experiment 2
      • A subject has name drawn at random
      • If chosen for control group, given glass of beer to drink
      • Otherwise given nothing
      • "All you who have had a beer come to receive your glass of water, if you had nothing, to receive your glass of wine."
  43. Experiment 1
      • Probability of receiving wine if ball blue = 0
      • Probability of receiving wine if ball red = 1
      • The propensity score takes on the values 0 and 1
      • Do you have to stratify by the propensity score?
  44. Experiment 2
      • Probability of receiving wine if beer = 0
      • Probability of receiving wine if no beer = 1
      • The propensity score takes on the values 0 and 1
      • Do you have to stratify by the propensity score?
  45. The Difference?
      • The difference between these two experiments is not the propensity score
      • This takes the values 0 and 1 in both cases, and every subject in both cases has a score of 0 or 1
      • The difference is that in the second case the covariate used to construct the score (beer) is predictive of outcome and in the first (ball colour) it is not
  46. Consequence
      • It is association with outcome that is important
        – ANCOVA tradition
      • Not association with assignment
        – Propensity point of view
  47. And that Question
      • Consider these two experiments
        – A completely randomised trial: patients allocated with 50% probability to A or B
        – Randomised matched pairs: member of any pair randomised with 50% probability to A or B
      • In analysing, would you ignore the matching in the second case?
      • The propensity score philosophy says you can!
  48. Finally
      "All scientific work is incomplete, whether it be observational or experimental. All scientific work is liable to be upset or modified by advancing knowledge. That does not confer upon us a freedom to ignore the knowledge we already have, or to postpone the action that it appears to demand at a given time." Sir Austin Bradford Hill, 1965
