SlideShare una empresa de Scribd logo
1 de 53
Descargar para leer sin conexión
Missing data
Julian Higgins
MRC Biostatistics Unit
Cambridge, UK
with thanks to Ian White, Fred Wolf, Angela Wood, Alex Sutton
SMG advanced workshop, Cardiff, March 4-5, 2010
Potential sources of missing data
in a meta-analysis
Potential sources of missing data
in a meta-analysis
• Studies not found
• Outcome not reported
• Outcome partially reported (e.g. missing SD or SE)
• Extra information required for meta-analysis
(e.g. imputed correlation coefficient)
• Missing participants
• No information on study characteristic for heterogeneity
analysis
Concepts in missing dataConcepts in missing data
1. ‘Missing completely at random’
• As if missing observation was randomly picked
• Missingness unrelated to the true (missing) value
• e.g.
– (genuinely) accidental deletion of a file
– observations measured on a random sample
– statistics not reported due to ignorance of their
importance(?)
• OK (unbiased) to analyse just the data available, but
sample size is reduced - which may not be ideal
• Not very common!
Concepts in missing dataConcepts in missing data
2. ‘Missing at random’
• Missingness depends on things you know about, and
not on the missing data themselves
• e.g.
– older people more likely to drop out (irrespective of
the effect of treatment)
– recent cluster trials more likely to report intraclass
correlation
– experimental treatment more likely to cause drop-
out (independent of therapeutic effect)
• Usually surmountable
Concepts in missing dataConcepts in missing data
3. ‘Informatively missing’
• The fact that an observation is missing is a
consequence of the value of the missing observation
• e.g.
– publication bias (non-significant result leads to
missingness)
– drop-out due to treatment failure (bad outcome
leads to missingness)
– selective reporting
• methods not reported because they were poor
• unexpected or disliked findings suppressed
• Requires careful consideration
• Very common!
Strategies for dealing with missing dataStrategies for dealing with missing data
• Ignore (exclude) the missing data
– Generally unwise
• Obtain the missing data
– Contact primary investigators
– Compute from available information
• Re-interpret the analysis
– Focus on sub-population, or place results in context
• Simple imputation
– ‘Fill in’ a value for each missing datum
• Multiple imputation
– Impute several times from a random distribution and
combine over results of multiple analyses
• Analytical techniques
– e.g. EM algorithm, maximum likelihood techniques
Strategies for dealing with missing dataStrategies for dealing with missing data
• Do sensitivity analyses
– But what if it makes a big difference?
Studies not foundStudies not found
Missing studiesMissing studies
• If studies are informatively missing, this is
publication bias
• If:
1. a funnel plot is asymmetrical
2. publication bias is assumed to be the cause
3. a fixed-effect of random-effects assumption is
reasonable
• then trim and fill provides a simple imputation method
– but I think these assumptions are difficult to believe
– I favour extrapolation of a regression line
Moreno, Sutton, Turner, Abrams, Cooper, Palmer and Ades. BMJ 2009; 339: b2981
The trim and fill methodThe trim and fill method
• Formalises the use of the funnel plot
• Rank-based ‘data augmentation’ technique
– estimates and adjusts for the number and outcomes of
missing studies
• Relatively simple to implement
• Simulations suggest it performs well
Duval and Tweedie. Biometrics 2000; 56: 455–463
The trim and fill method (cont)The trim and fill method (cont)
• Funnel plot of the
effect of gangliosides
in acute ischaemic
stroke
The trim and fill method (cont)The trim and fill method (cont)
• Estimate number and
trim asymmetric
studies
The trim and fill method (cont)The trim and fill method (cont)
• Calculate ‘centre’ of
remainder
The trim and fill method (cont)The trim and fill method (cont)
• Replace the trimmed
studies and fill with
their missing mirror
image counterparts to
estimate effect and
its variance
Extrapolation of funnel plot
regression line
Extrapolation of funnel plot
regression line
Odds ratio
0.1 0.3 1 3
3
2
1
0
100.6
Standarderror
Maximum
observed precision
or
Infinite precision
Outcome not reportedOutcome not reported
• Wait until tomorrow
Outcome partially reportedOutcome partially reported
Missing standard deviationsMissing standard deviations
• Make sure these are computed where possible (e.g. from
T statistics, F statistics, standard errors, P values)
– For non-exact P values (e.g. P < 0.05, P > 0.05)?
• SDs are not necessary: the generic inverse variance
method in RevMan can analyse estimates and SEs
• Where SDs are genuinely missing and needed, consider
imputation
– borrow from a similar study?
– is it better to impute than to leave the study out of the
meta-analysis?
Missing standard deviationsMissing standard deviations
“It was decided that missing standard deviations for caries
increments that were not revealed by contacting the
original researchers would be imputed through linear
regression of log(standard deviation)s on log(mean
caries) increments. This is a suitable approach for caries
prevention studies since, as they follow an approximate
Poisson distribution, caries increments are closely related
to their standard deviations (van Rijkom 1998).”
Marinho, Higgins, Logan and Sheiham, CDSR 2003, Issue 1. Art. No.: CD002278
Sensitivity analysis
Asthma self management & ER visits BMJ 2003, from Fred Wolf
Sensitivity analysis
Asthma self management & ER visits BMJ 2003, from Fred Wolf
N = 7 studies (n=704)
(no missing data studies)
• SMD (CI)
-0.25 (-0.40, -0.10)
• Larger effect size
(over estimation?)
• Less precision
(more error?)
N = 12 studies (n=1114)
(5 with imputed data)
• SMD (CI)
-0.21 (-0.33, -0.09)
• Smaller effect size
• Greater precision
• > statistical Power
(> # studies & subjects)
Other possible sensitivity analyses:
vary imputation methods & assumptions
Extra information required for
meta-analysis
Extra information required for
meta-analysis
Missing correlation coefficientsMissing correlation coefficients
• Common problem for change-from-baseline / ANCOVA,
cross-over trials, cluster-randomized trials, combining or
comparing outcomes / time points, multivariate meta-
analysis
• If analysis fails to account for pairing or clustering, can
adjust it using an imputed correlation coefficient
– Correlation can often be computed for cross-over trials
from paired and unpaired results
• then lent to another study that doesn’t report paired results
– For cluster-randomized trials, ICC resources exist:
• Campbell et al, Statistics in Medicine 2001; 20:391-9
• Ukoumunne et al, Health Technology Assessment 1999; 3 (no 5)
• Health Services Research Unit Aberdeen
www.abdn.ac.uk/hsru/epp/cluster.shtml
Missing correlation coefficientsMissing correlation coefficients
• Guidance on change from baseline, cross-over trials,
cluster-randomized trials available in the Handbook
• See also Follmann (JCE, 1992)
Missing participantsMissing participants
Missing outcomes from individual
participants
Missing outcomes from individual
participants
• Consider first a dichotomous outcome
• From each trial, a 32 table
nCmCfCrCControl
nTmTfTrTTreatment
TotalMissingFailureSuccess
Haloperidol for schizophrenia (Cochrane review)Haloperidol for schizophrenia (Cochrane review)
2 52
22 69
1 30
0 13
0 21
0 19
1 26
0 17
2 66
0 11
3 38
0 29
11 29
0 15
0 17
0 12
1 31)
0 51
34 68
1 31
0 13
0 22
0 15
1 26
0 13
2 66
0 11
0 14
0 11
18 29
1 15
1 9
0 12
1 31
mT nT
mC nC
Haloperidol Placebo
Risk ratio for clinical improvement
0.1 1 10 50
Study % Weight
Arvanitis
Beasley
Bechelli
Borison
Chouinard
Durost
Garry
Howard
Marder
Nishikawa 82
Nishikawa 84
Reschke
Selman
Serafetinides
Simpson
Spencer
Vichaiya
18.9
31.2
2.0
0.5
3.1
1.1
3.4
3.3
11.4
0.4
0.5
2.5
19.1
0.5
0.5
1.1
0.5
1.57 (1.28,1.92)Overall (FE, 95% CI)
Favours placebo Favours placebo
Typical practice in Cochrane reviewsTypical practice in Cochrane reviews
• Ignore missing data completely
– Call the above an available case analysis
– often called complete case analysis
• Impute success, failure, best-case, worst-case etc
– Call this an imputed case analysis
– analysis often proceeds assuming imputed data are
known
Graphical interpretationGraphical interpretation
Failure
Success
Missing
Treatment Control
Available case analysisAvailable case analysis
• (a.k.a. complete case analysis)
• Ignore missing data
• May be biased
• Over-precise
– Consider a trial of 80 patients,
with 60 followed up. We
should be more uncertain with
60/80 of data than with 60/60
of data
Treatment Control
vs
Imputation strategies (1/5)Imputation strategies (1/5)
• Impute all successes • Impute all failures
Treatment ControlTreatment Control
Imputation strategies (2/5)Imputation strategies (2/5)
• Best case for treatment • Worst case for treatment
Treatment ControlTreatment Control
Imputation strategies (3/5)Imputation strategies (3/5)
• Impute treatment rate • Impute control rate
Treatment ControlTreatment Control
Imputation strategies (4/5)Imputation strategies (4/5)
• Impute group-specific rate
Treatment Control
Using reasons for missingnessUsing reasons for missingness
• Sometimes reasons for missing data are available
• Use these to impute particular outcomes for missing
participants
• e.g. in haloperidol trials:
– Lack of efficacy, relapse: impute failure
– Positive response: impute success
– Adverse effects, non-compliance: impute control event rate
– Loss to follow-up, administrative
reasons, patient sleeping: impute group-specific event rate
Higgins , White and Wood. Clinical Trials 2008; 5: 225–239
Application to haloperidolApplication to haloperidol
1.8 (1.4, 2.1) Q=22
1.5 (1.2, 1.7) Q=31
1.3 (1.1, 1.5) Q=34
1.4 (1.2, 1.6) Q=31
0.9 (0.8, 1.1) Q=62
2.4 (1.9, 3.0) Q=22
1.9 (1.5, 2.4) Q=22
1.2 (1.0, 1.3) Q=40
1.6 (1.3, 1.9) Q=27 (16 df)
Fixed-effect meta-analysis
incorporating available reasons for
missing data
according to observed group-specific
event rate
according to observed treatment
event rate, pT
according to observed control event
rate, pC
missing = success (1)
missing = failure (0)
Imputation
worst case scenario for treatment
best case scenario for treatment
nothing (available case analysis)
Application to haloperidolApplication to haloperidol
Imputation
1.3 (28%)
1.0 (36%)
1.0 (25%)
1.0 (37%)
0.5 (33%)
2.5 (30%)
1.4 (25%)
0.9 (36%)
1.0 (31%)
RR (weight)
1.8 (22%)
1.5 (32%)
1.1 (51%)
1.3 (27%)
0.7 (26%)
4.0 (11%)
2.4 (10%)
1.1 (47%)
1.5 (19%)
RR (weight)
incorporating available reasons for
missing data
according to observed group-specific
event rate
according to observed treatment
event rate, pT
according to observed control event
rate, pC
missing = success (1)
missing = failure (0)
worst case scenario for treatment
best case scenario for treatment
nothing (available case analysis)
SelmanBeasley
Generalization of imputing schemesGeneralization of imputing schemes
• Consider control group only
• To reflect informative missingness,
specify odds ratio comparing
event rate among missing participants
vs
event rate among observed participants
• Call this informative missingness odds
ratio (IMORC)
• Similarly for treatment group (IMORT)
vs
incorporating available reasons for
missing data
11according to observed group-specific
event rate
1all according to observed treatment
event rate, pT
1all according to observed control
event rate, pC
missing = success (1)
00missing = failure (0)
IMORCIMORTImputation
 
 


C T
C T
p 1 p
1 p p
 
 
T C
T C
p 1 p
1 p p


 
 
M
T T
M
T T
p 1 p
1 p p


 
 
M
C C
M
C C
p 1 p
1 p p


0 [or ] [or 0]
Impute to create worst case scenario
for treatment
 [or 0]0 [or ]
to create best case scenario for
treatment
ConnectionsConnections
Weighting studiesWeighting studies
• Suppose we were to treat imputed data as known
– Standard errors will tend to be too small and weights too
big
• Some simple alternative weights might be used
– use the available case weights
– use an effective sample size argument, but with revised
event rates
• We have derived more suitable weights based on IMORs: an
analytic strategy
metamiss in Statametamiss in Stata
• All the above strategies are implemented in our program
metamiss
• Download it within Stata using
ssc install metamiss
• or download the latest version using
net from http://www.mrc-
bsu.cam.ac.uk/BSUsite/
Software/pub/software/stata/meta
White and Higgins. The Stata Journal 2009; 9: 57-69.
Overall (I-squared = 26.8%, p = 0.148)
Selman
Nishikawa_82
Borison
Howard
ID
Chouinard
Reschke
Marder
Simpson
Serafetinides
Durost
Garry
Spencer
Nishikawa_84
Beasley
Bechelli
Vichaiya
Arvanitis
Study
1.2 1 5
Risk ratio, intervention vs. control
Haloperidol meta-analysis: using
reasons
Haloperidol meta-analysis: using
reasons
Implications for practiceImplications for practice
• We desire a strategy for primary analysis and for
sensitivity analyses
• Primary analysis
– present available case analysis as a reference point
– make use of available reasons for missingness
– consider using IMORs based on external evidence or
heuristic arguments
Implications for practiceImplications for practice
• Sensitivity analysis
– Best-case/worst case scenarios are fine but probably
too unrealistic
– We suggest to use IMORs to move towards these
analyses, using clinically realistic values
– Should also evaluate impact of changing weights
Treatmenteventrate
Control event rate
0 0.25 0.5 0.75 1
0
0.25
0.5
0.75
1
all eventsbest-case
worst-caseno events
Application to haloperidolApplication to haloperidol
1.8 (1.4, 2.2) Q=22
1.4 (1.1, 1.7) Q=34
1.7 (1.4, 2.1) Q=25
1.4 (1.2, 1.7) Q=31
1.6 (1.3, 1.9) Q=27 (16 df)
Fixed-effect meta-analysis
IMORT = 1/2, IMORC = 1/2
IMORT = 1/2, IMORC = 2
Imputation
Available case
IMORT = 2, IMORC = 1/2
IMORT = 2, IMORC = 2
Conclusion: no important impact of missing data
Missing data introduce extra uncertaintyMissing data introduce extra uncertainty
• Even if we make assumptions about the missing data, we
don’t know if they are correct
• For example, we don’t know if IMOR is 0 or 1.
• Allowing for missing data should introduce extra
uncertainty
– increase the standard error
– hence down-weight trials with more missing data in the
meta-analysis
• We do this by allowing for uncertainty in the IMORs
– sdlogimor(1) option in metamiss specifies a sensible
degree of uncertainty about the IMOR
White, Wood and Higgins. Statistics in Medicine 2008; 27: 711–727.
sdlogimor(): technical detailssdlogimor(): technical details
• We place a normal distribution on the log IMOR
• For example, we might give the log IMOR mean -1 and
standard deviation 1
– our “best guess” is log IMOR = -1 (IMOR = 0.37)
– we are 68% sure the log IMOR lies between -2 and
0 (IMOR lies between 0.14 and 1)
– we are 95% sure the log IMOR lies between -3 and
+1 (IMOR lies between 0.05 and 2.7)
Continuous dataContinuous data
• Many methods available to primary trialists, e.g.
– last observation carried forward
– regression imputation
– analytic approaches
• Imputing is much more difficult for the meta-analyst
• You can
– Impute using the mean (corresponds to increasing
sample size to include missing people)
– Impute using a specific value
– Apply an ‘IMMD’ or ‘IMMR’
(‘informative missingness mean difference’ or ‘…mean
ratio’)
• Methods to properly account for uncertainty are not yet
developed
RemarksRemarks
• Many imputing schemes are available, but these should
not be used to enter ‘filled-out’ data into RevMan
• However, the point estimates from such analyses may be
used with weights from an available case analysis
(requires analyses outside of RevMan)
Remarks (ctd)Remarks (ctd)
• Informative missingness odds ratios offer advantages of
– a generalization of the imputation schemes
– can reflect ‘realistic’ scenarios
– statistical expressions for variances
– ability to incorporate prior distributions on IMORs
• Sensitivity analyses are essential, and should ideally
address both estimates and weights
• See: Higgins JPT, White IR, Wood AM. Missing outcome data in meta-
analysis of clinical trials: development and comparison of methods, with
suggestions for practice. (Clinical Trials)
No information on study characteristic
for heterogeneity analysis
No information on study characteristic
for heterogeneity analysis
General recommendations for dealing
with missing data
General recommendations for dealing
with missing data
• Whenever possible, contact original investigators to
request missing data
• Make explicit assumptions of methods used to address
missing data
• Conduct sensitivity analyses to assess how sensitive
results are to reasonable changes in assumptions that
are made
• Address potential impact of missing data (known or
suspected) on findings of the review in the Discussion
section

Más contenido relacionado

La actualidad más candente

Imran rizvi statistics in meta analysis
Imran rizvi statistics in meta analysisImran rizvi statistics in meta analysis
Imran rizvi statistics in meta analysisImran Rizvi
 
Network meta-analysis & models for inconsistency
Network meta-analysis & models for inconsistencyNetwork meta-analysis & models for inconsistency
Network meta-analysis & models for inconsistencycheweb1
 
Sample Size Estimation and Statistical Test Selection
Sample Size Estimation  and Statistical Test SelectionSample Size Estimation  and Statistical Test Selection
Sample Size Estimation and Statistical Test SelectionVaggelis Vergoulas
 
2010 JSM - Meta Stat Issue Medical Devices
2010 JSM - Meta Stat Issue Medical Devices2010 JSM - Meta Stat Issue Medical Devices
2010 JSM - Meta Stat Issue Medical DevicesTerry Liao
 
Statistics in meta analysis
Statistics in meta analysisStatistics in meta analysis
Statistics in meta analysisDr Shri Sangle
 
Meta-analysis when the normality assumptions are violated (2008)
Meta-analysis when the normality assumptions are violated (2008)Meta-analysis when the normality assumptions are violated (2008)
Meta-analysis when the normality assumptions are violated (2008)Evangelos Kontopantelis
 
Cochrane Collaboration
Cochrane CollaborationCochrane Collaboration
Cochrane CollaborationNinian Peckitt
 
NON-PARAMETRIC TESTS by Prajakta Sawant
NON-PARAMETRIC TESTS by Prajakta SawantNON-PARAMETRIC TESTS by Prajakta Sawant
NON-PARAMETRIC TESTS by Prajakta SawantPRAJAKTASAWANT33
 
2.0.statistical methods and determination of sample size
2.0.statistical methods and determination of sample size2.0.statistical methods and determination of sample size
2.0.statistical methods and determination of sample sizesalummkata1
 
Meta analysis
Meta analysisMeta analysis
Meta analysisSethu S
 
Analytic Methods and Issues in CER from Observational Data
Analytic Methods and Issues in CER from Observational DataAnalytic Methods and Issues in CER from Observational Data
Analytic Methods and Issues in CER from Observational DataCTSI at UCSF
 
Statistical methods for the life sciences lb
Statistical methods for the life sciences lbStatistical methods for the life sciences lb
Statistical methods for the life sciences lbpriyaupm
 
Absence of a gold standard in diagnostic test accuracy research
Absence of a gold standard in diagnostic test accuracy researchAbsence of a gold standard in diagnostic test accuracy research
Absence of a gold standard in diagnostic test accuracy researchMaarten van Smeden
 
Alternatives to t test
Alternatives to t testAlternatives to t test
Alternatives to t testLONDIWE SHANGE
 
Statistical Significance Testing in Information Retrieval: An Empirical Analy...
Statistical Significance Testing in Information Retrieval: An Empirical Analy...Statistical Significance Testing in Information Retrieval: An Empirical Analy...
Statistical Significance Testing in Information Retrieval: An Empirical Analy...Julián Urbano
 
Parmetric and non parametric statistical test in clinical trails
Parmetric and non parametric statistical test in clinical trailsParmetric and non parametric statistical test in clinical trails
Parmetric and non parametric statistical test in clinical trailsVinod Pagidipalli
 
Sample size calculation - a brief overview
Sample size calculation - a brief overviewSample size calculation - a brief overview
Sample size calculation - a brief overviewAzmi Mohd Tamil
 

La actualidad más candente (20)

Imran rizvi statistics in meta analysis
Imran rizvi statistics in meta analysisImran rizvi statistics in meta analysis
Imran rizvi statistics in meta analysis
 
Analysis and Interpretation
Analysis and InterpretationAnalysis and Interpretation
Analysis and Interpretation
 
Network meta-analysis & models for inconsistency
Network meta-analysis & models for inconsistencyNetwork meta-analysis & models for inconsistency
Network meta-analysis & models for inconsistency
 
Statistical test
Statistical testStatistical test
Statistical test
 
Sample Size Estimation and Statistical Test Selection
Sample Size Estimation  and Statistical Test SelectionSample Size Estimation  and Statistical Test Selection
Sample Size Estimation and Statistical Test Selection
 
2010 JSM - Meta Stat Issue Medical Devices
2010 JSM - Meta Stat Issue Medical Devices2010 JSM - Meta Stat Issue Medical Devices
2010 JSM - Meta Stat Issue Medical Devices
 
Statistics in meta analysis
Statistics in meta analysisStatistics in meta analysis
Statistics in meta analysis
 
Meta-analysis when the normality assumptions are violated (2008)
Meta-analysis when the normality assumptions are violated (2008)Meta-analysis when the normality assumptions are violated (2008)
Meta-analysis when the normality assumptions are violated (2008)
 
Cochrane Collaboration
Cochrane CollaborationCochrane Collaboration
Cochrane Collaboration
 
NON-PARAMETRIC TESTS by Prajakta Sawant
NON-PARAMETRIC TESTS by Prajakta SawantNON-PARAMETRIC TESTS by Prajakta Sawant
NON-PARAMETRIC TESTS by Prajakta Sawant
 
Seminar in Meta-analysis
Seminar in Meta-analysisSeminar in Meta-analysis
Seminar in Meta-analysis
 
2.0.statistical methods and determination of sample size
2.0.statistical methods and determination of sample size2.0.statistical methods and determination of sample size
2.0.statistical methods and determination of sample size
 
Meta analysis
Meta analysisMeta analysis
Meta analysis
 
Analytic Methods and Issues in CER from Observational Data
Analytic Methods and Issues in CER from Observational DataAnalytic Methods and Issues in CER from Observational Data
Analytic Methods and Issues in CER from Observational Data
 
Statistical methods for the life sciences lb
Statistical methods for the life sciences lbStatistical methods for the life sciences lb
Statistical methods for the life sciences lb
 
Absence of a gold standard in diagnostic test accuracy research
Absence of a gold standard in diagnostic test accuracy researchAbsence of a gold standard in diagnostic test accuracy research
Absence of a gold standard in diagnostic test accuracy research
 
Alternatives to t test
Alternatives to t testAlternatives to t test
Alternatives to t test
 
Statistical Significance Testing in Information Retrieval: An Empirical Analy...
Statistical Significance Testing in Information Retrieval: An Empirical Analy...Statistical Significance Testing in Information Retrieval: An Empirical Analy...
Statistical Significance Testing in Information Retrieval: An Empirical Analy...
 
Parmetric and non parametric statistical test in clinical trails
Parmetric and non parametric statistical test in clinical trailsParmetric and non parametric statistical test in clinical trails
Parmetric and non parametric statistical test in clinical trails
 
Sample size calculation - a brief overview
Sample size calculation - a brief overviewSample size calculation - a brief overview
Sample size calculation - a brief overview
 

Similar a 2010 smg training_cardiff_day1_session3_higgins

Introduction to biostatistics
Introduction to biostatisticsIntroduction to biostatistics
Introduction to biostatisticsAli Al Mousawi
 
Stat 3203 -sampling errors and non-sampling errors
Stat 3203 -sampling errors  and non-sampling errorsStat 3203 -sampling errors  and non-sampling errors
Stat 3203 -sampling errors and non-sampling errorsKhulna University
 
Analysis-of-data-with-missing-values.pptx
Analysis-of-data-with-missing-values.pptxAnalysis-of-data-with-missing-values.pptx
Analysis-of-data-with-missing-values.pptxAASTHAJAJOO
 
Probability and data 1w
Probability and data 1wProbability and data 1w
Probability and data 1wKyoungilYoon
 
Presentation research- chapter 10-11 istiqlal
Presentation research- chapter 10-11 istiqlalPresentation research- chapter 10-11 istiqlal
Presentation research- chapter 10-11 istiqlalIstiqlalEid
 
Marketing Research Project on T test
Marketing Research Project on T test Marketing Research Project on T test
Marketing Research Project on T test Meghna Baid
 
Some nonparametric statistic for categorical &amp; ordinal data
Some nonparametric statistic for categorical &amp; ordinal dataSome nonparametric statistic for categorical &amp; ordinal data
Some nonparametric statistic for categorical &amp; ordinal dataRegent University
 
4 Inferential Statistics IV - April 7 2014.pdf
4 Inferential Statistics IV - April 7 2014.pdf4 Inferential Statistics IV - April 7 2014.pdf
4 Inferential Statistics IV - April 7 2014.pdfBijayThapa30
 
Introduction to Data Management in Human Ecology
Introduction to Data Management in Human EcologyIntroduction to Data Management in Human Ecology
Introduction to Data Management in Human EcologyKern Rocke
 
Basic stat analysis using excel
Basic stat analysis using excelBasic stat analysis using excel
Basic stat analysis using excelParag Shah
 
How to Structure the “Approach” Section of a Grant Application by David Elash...
How to Structure the “Approach” Section of a Grant Application by David Elash...How to Structure the “Approach” Section of a Grant Application by David Elash...
How to Structure the “Approach” Section of a Grant Application by David Elash...UCLA CTSI
 
How to Structure the “Approach” Section of a Grant Application (2020)
How to Structure the “Approach” Section of a Grant Application (2020)How to Structure the “Approach” Section of a Grant Application (2020)
How to Structure the “Approach” Section of a Grant Application (2020)UCLA CTSI
 
6.hypothesistesting.06
6.hypothesistesting.066.hypothesistesting.06
6.hypothesistesting.06MehwishFahad
 
Some statistical concepts relevant to proteomics data analysis
Some statistical concepts relevant to proteomics data analysisSome statistical concepts relevant to proteomics data analysis
Some statistical concepts relevant to proteomics data analysisUC Davis
 
De-Mystifying Stats: A primer on basic statistics
De-Mystifying Stats: A primer on basic statisticsDe-Mystifying Stats: A primer on basic statistics
De-Mystifying Stats: A primer on basic statisticsGillian Byrne
 
Missing Data Analysis_Data Analysis Techniques
Missing Data Analysis_Data Analysis TechniquesMissing Data Analysis_Data Analysis Techniques
Missing Data Analysis_Data Analysis TechniquesPrakriti Chandan Sinha
 

Similar a 2010 smg training_cardiff_day1_session3_higgins (20)

Intro statistics
Intro statisticsIntro statistics
Intro statistics
 
Introduction to biostatistics
Introduction to biostatisticsIntroduction to biostatistics
Introduction to biostatistics
 
LR 9 Estimation.pdf
LR 9 Estimation.pdfLR 9 Estimation.pdf
LR 9 Estimation.pdf
 
Environmental statistics
Environmental statisticsEnvironmental statistics
Environmental statistics
 
Stat 3203 -sampling errors and non-sampling errors
Stat 3203 -sampling errors  and non-sampling errorsStat 3203 -sampling errors  and non-sampling errors
Stat 3203 -sampling errors and non-sampling errors
 
Analysis-of-data-with-missing-values.pptx
Analysis-of-data-with-missing-values.pptxAnalysis-of-data-with-missing-values.pptx
Analysis-of-data-with-missing-values.pptx
 
Probability and data 1w
Probability and data 1wProbability and data 1w
Probability and data 1w
 
Presentation research- chapter 10-11 istiqlal
Presentation research- chapter 10-11 istiqlalPresentation research- chapter 10-11 istiqlal
Presentation research- chapter 10-11 istiqlal
 
Marketing Research Project on T test
Marketing Research Project on T test Marketing Research Project on T test
Marketing Research Project on T test
 
Some nonparametric statistic for categorical &amp; ordinal data
Some nonparametric statistic for categorical &amp; ordinal dataSome nonparametric statistic for categorical &amp; ordinal data
Some nonparametric statistic for categorical &amp; ordinal data
 
4 Inferential Statistics IV - April 7 2014.pdf
4 Inferential Statistics IV - April 7 2014.pdf4 Inferential Statistics IV - April 7 2014.pdf
4 Inferential Statistics IV - April 7 2014.pdf
 
Introduction to Data Management in Human Ecology
Introduction to Data Management in Human EcologyIntroduction to Data Management in Human Ecology
Introduction to Data Management in Human Ecology
 
Statistics
StatisticsStatistics
Statistics
 
Basic stat analysis using excel
Basic stat analysis using excelBasic stat analysis using excel
Basic stat analysis using excel
 
How to Structure the “Approach” Section of a Grant Application by David Elash...
How to Structure the “Approach” Section of a Grant Application by David Elash...How to Structure the “Approach” Section of a Grant Application by David Elash...
How to Structure the “Approach” Section of a Grant Application by David Elash...
 
How to Structure the “Approach” Section of a Grant Application (2020)
How to Structure the “Approach” Section of a Grant Application (2020)How to Structure the “Approach” Section of a Grant Application (2020)
How to Structure the “Approach” Section of a Grant Application (2020)
 
6.hypothesistesting.06
6.hypothesistesting.066.hypothesistesting.06
6.hypothesistesting.06
 
Some statistical concepts relevant to proteomics data analysis
Some statistical concepts relevant to proteomics data analysisSome statistical concepts relevant to proteomics data analysis
Some statistical concepts relevant to proteomics data analysis
 
De-Mystifying Stats: A primer on basic statistics
De-Mystifying Stats: A primer on basic statisticsDe-Mystifying Stats: A primer on basic statistics
De-Mystifying Stats: A primer on basic statistics
 
Missing Data Analysis_Data Analysis Techniques
Missing Data Analysis_Data Analysis TechniquesMissing Data Analysis_Data Analysis Techniques
Missing Data Analysis_Data Analysis Techniques
 

Último

Active Learning Strategies (in short ALS).pdf
Active Learning Strategies (in short ALS).pdfActive Learning Strategies (in short ALS).pdf
Active Learning Strategies (in short ALS).pdfPatidar M
 
4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptx4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptxmary850239
 
How to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPHow to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPCeline George
 
Dust Of Snow By Robert Frost Class-X English CBSE
Dust Of Snow By Robert Frost Class-X English CBSEDust Of Snow By Robert Frost Class-X English CBSE
Dust Of Snow By Robert Frost Class-X English CBSEaurabinda banchhor
 
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptxAUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptxiammrhaywood
 
Presentation Activity 2. Unit 3 transv.pptx
Presentation Activity 2. Unit 3 transv.pptxPresentation Activity 2. Unit 3 transv.pptx
Presentation Activity 2. Unit 3 transv.pptxRosabel UA
 
Integumentary System SMP B. Pharm Sem I.ppt
Integumentary System SMP B. Pharm Sem I.pptIntegumentary System SMP B. Pharm Sem I.ppt
Integumentary System SMP B. Pharm Sem I.pptshraddhaparab530
 
EMBODO Lesson Plan Grade 9 Law of Sines.docx
EMBODO Lesson Plan Grade 9 Law of Sines.docxEMBODO Lesson Plan Grade 9 Law of Sines.docx
EMBODO Lesson Plan Grade 9 Law of Sines.docxElton John Embodo
 
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxINTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxHumphrey A Beña
 
ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4MiaBumagat1
 
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSGRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSJoshuaGantuangco2
 
Concurrency Control in Database Management system
Concurrency Control in Database Management systemConcurrency Control in Database Management system
Concurrency Control in Database Management systemChristalin Nelson
 
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...Nguyen Thanh Tu Collection
 
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptxQ4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptxlancelewisportillo
 
ICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdfICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdfVanessa Camilleri
 
Expanded definition: technical and operational
Expanded definition: technical and operationalExpanded definition: technical and operational
Expanded definition: technical and operationalssuser3e220a
 
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Celine George
 
Textual Evidence in Reading and Writing of SHS
Textual Evidence in Reading and Writing of SHSTextual Evidence in Reading and Writing of SHS
Textual Evidence in Reading and Writing of SHSMae Pangan
 

Último (20)

Active Learning Strategies (in short ALS).pdf
Active Learning Strategies (in short ALS).pdfActive Learning Strategies (in short ALS).pdf
Active Learning Strategies (in short ALS).pdf
 
4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptx4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptx
 
How to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPHow to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERP
 
Dust Of Snow By Robert Frost Class-X English CBSE
Dust Of Snow By Robert Frost Class-X English CBSEDust Of Snow By Robert Frost Class-X English CBSE
Dust Of Snow By Robert Frost Class-X English CBSE
 
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptxAUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptx
 
Presentation Activity 2. Unit 3 transv.pptx
Presentation Activity 2. Unit 3 transv.pptxPresentation Activity 2. Unit 3 transv.pptx
Presentation Activity 2. Unit 3 transv.pptx
 
Integumentary System SMP B. Pharm Sem I.ppt
Integumentary System SMP B. Pharm Sem I.pptIntegumentary System SMP B. Pharm Sem I.ppt
Integumentary System SMP B. Pharm Sem I.ppt
 
EMBODO Lesson Plan Grade 9 Law of Sines.docx
EMBODO Lesson Plan Grade 9 Law of Sines.docxEMBODO Lesson Plan Grade 9 Law of Sines.docx
EMBODO Lesson Plan Grade 9 Law of Sines.docx
 
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxINTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
 
INCLUSIVE EDUCATION PRACTICES FOR TEACHERS AND TRAINERS.pptx
INCLUSIVE EDUCATION PRACTICES FOR TEACHERS AND TRAINERS.pptxINCLUSIVE EDUCATION PRACTICES FOR TEACHERS AND TRAINERS.pptx
INCLUSIVE EDUCATION PRACTICES FOR TEACHERS AND TRAINERS.pptx
 
ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4
 
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSGRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
 
YOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptx
YOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptxYOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptx
YOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptx
 
Concurrency Control in Database Management system
Concurrency Control in Database Management systemConcurrency Control in Database Management system
Concurrency Control in Database Management system
 
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
 
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptxQ4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
 
ICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdfICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdf
 
Expanded definition: technical and operational
Expanded definition: technical and operationalExpanded definition: technical and operational
Expanded definition: technical and operational
 
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
 
Textual Evidence in Reading and Writing of SHS
Textual Evidence in Reading and Writing of SHSTextual Evidence in Reading and Writing of SHS
Textual Evidence in Reading and Writing of SHS
 

2010 smg training_cardiff_day1_session3_higgins

  • 1. Missing data Julian Higgins MRC Biostatistics Unit Cambridge, UK with thanks to Ian White, Fred Wolf, Angela Wood, Alex Sutton SMG advanced workshop, Cardiff, March 4-5, 2010
  • 2. Potential sources of missing data in a meta-analysis Potential sources of missing data in a meta-analysis • Studies not found • Outcome not reported • Outcome partially reported (e.g. missing SD or SE) • Extra information required for meta-analysis (e.g. imputed correlation coefficient) • Missing participants • No information on study characteristic for heterogeneity analysis
  • 3. Concepts in missing dataConcepts in missing data 1. ‘Missing completely at random’ • As if missing observation was randomly picked • Missingness unrelated to the true (missing) value • e.g. – (genuinely) accidental deletion of a file – observations measured on a random sample – statistics not reported due to ignorance of their importance(?) • OK (unbiased) to analyse just the data available, but sample size is reduced - which may not be ideal • Not very common!
  • 4. Concepts in missing dataConcepts in missing data 2. ‘Missing at random’ • Missingness depends on things you know about, and not on the missing data themselves • e.g. – older people more likely to drop out (irrespective of the effect of treatment) – recent cluster trials more likely to report intraclass correlation – experimental treatment more likely to cause drop- out (independent of therapeutic effect) • Usually surmountable
  • 5. Concepts in missing dataConcepts in missing data 3. ‘Informatively missing’ • The fact that an observation is missing is a consequence of the value of the missing observation • e.g. – publication bias (non-significant result leads to missingness) – drop-out due to treatment failure (bad outcome leads to missingness) – selective reporting • methods not reported because they were poor • unexpected or disliked findings suppressed • Requires careful consideration • Very common!
  • 6. Strategies for dealing with missing dataStrategies for dealing with missing data • Ignore (exclude) the missing data – Generally unwise • Obtain the missing data – Contact primary investigators – Compute from available information • Re-interpret the analysis – Focus on sub-population, or place results in context • Simple imputation – ‘Fill in’ a value for each missing datum • Multiple imputation – Impute several times from a random distribution and combine over results of multiple analyses • Analytical techniques – e.g. EM algorithm, maximum likelihood techniques
  • 7. Strategies for dealing with missing dataStrategies for dealing with missing data • Do sensitivity analyses – But what if it makes a big difference?
  • 9. Missing studiesMissing studies • If studies are informatively missing, this is publication bias • If: 1. a funnel plot is asymmetrical 2. publication bias is assumed to be the cause 3. a fixed-effect of random-effects assumption is reasonable • then trim and fill provides a simple imputation method – but I think these assumptions are difficult to believe – I favour extrapolation of a regression line Moreno, Sutton, Turner, Abrams, Cooper, Palmer and Ades. BMJ 2009; 339: b2981
  • 10. The trim and fill methodThe trim and fill method • Formalises the use of the funnel plot • Rank-based ‘data augmentation’ technique – estimates and adjusts for the number and outcomes of missing studies • Relatively simple to implement • Simulations suggest it performs well Duval and Tweedie. Biometrics 2000; 56: 455–463
  • 11. The trim and fill method (cont)The trim and fill method (cont) • Funnel plot of the effect of gangliosides in acute ischaemic stroke
  • 12. The trim and fill method (cont)The trim and fill method (cont) • Estimate number and trim asymmetric studies
  • 13. The trim and fill method (cont)The trim and fill method (cont) • Calculate ‘centre’ of remainder
  • 14. The trim and fill method (cont)The trim and fill method (cont) • Replace the trimmed studies and fill with their missing mirror image counterparts to estimate effect and its variance
  • 15. Extrapolation of funnel plot regression line Extrapolation of funnel plot regression line Odds ratio 0.1 0.3 1 3 3 2 1 0 100.6 Standarderror Maximum observed precision or Infinite precision
  • 16. Outcome not reportedOutcome not reported • Wait until tomorrow
  • 17. Outcome partially reportedOutcome partially reported
  • 18. Missing standard deviationsMissing standard deviations • Make sure these are computed where possible (e.g. from T statistics, F statistics, standard errors, P values) – For non-exact P values (e.g. P < 0.05, P > 0.05)? • SDs are not necessary: the generic inverse variance method in RevMan can analyse estimates and SEs • Where SDs are genuinely missing and needed, consider imputation – borrow from a similar study? – is it better to impute than to leave the study out of the meta-analysis?
  • 19. Missing standard deviationsMissing standard deviations “It was decided that missing standard deviations for caries increments that were not revealed by contacting the original researchers would be imputed through linear regression of log(standard deviation)s on log(mean caries) increments. This is a suitable approach for caries prevention studies since, as they follow an approximate Poisson distribution, caries increments are closely related to their standard deviations (van Rijkom 1998).” Marinho, Higgins, Logan and Sheiham, CDSR 2003, Issue 1. Art. No.: CD002278
  • 20. Sensitivity analysis Asthma self management & ER visits BMJ 2003, from Fred Wolf Sensitivity analysis Asthma self management & ER visits BMJ 2003, from Fred Wolf N = 7 studies (n=704) (no missing data studies) • SMD (CI) -0.25 (-0.40, -0.10) • Larger effect size (over estimation?) • Less precision (more error?) N = 12 studies (n=1114) (5 with imputed data) • SMD (CI) -0.21 (-0.33, -0.09) • Smaller effect size • Greater precision • > statistical Power (> # studies & subjects) Other possible sensitivity analyses: vary imputation methods & assumptions
  • 21. Extra information required for meta-analysis Extra information required for meta-analysis
  • 22. Missing correlation coefficientsMissing correlation coefficients • Common problem for change-from-baseline / ANCOVA, cross-over trials, cluster-randomized trials, combining or comparing outcomes / time points, multivariate meta- analysis • If analysis fails to account for pairing or clustering, can adjust it using an imputed correlation coefficient – Correlation can often be computed for cross-over trials from paired and unpaired results • then lent to another study that doesn’t report paired results – For cluster-randomized trials, ICC resources exist: • Campbell et al, Statistics in Medicine 2001; 20:391-9 • Ukoumunne et al, Health Technology Assessment 1999; 3 (no 5) • Health Services Research Unit Aberdeen www.abdn.ac.uk/hsru/epp/cluster.shtml
  • 23. Missing correlation coefficientsMissing correlation coefficients • Guidance on change from baseline, cross-over trials, cluster-randomized trials available in the Handbook • See also Follmann (JCE, 1992)
  • 25. Missing outcomes from individual participants Missing outcomes from individual participants • Consider first a dichotomous outcome • From each trial, a 32 table nCmCfCrCControl nTmTfTrTTreatment TotalMissingFailureSuccess
  • 26. Haloperidol for schizophrenia (Cochrane review)Haloperidol for schizophrenia (Cochrane review) 2 52 22 69 1 30 0 13 0 21 0 19 1 26 0 17 2 66 0 11 3 38 0 29 11 29 0 15 0 17 0 12 1 31) 0 51 34 68 1 31 0 13 0 22 0 15 1 26 0 13 2 66 0 11 0 14 0 11 18 29 1 15 1 9 0 12 1 31 mT nT mC nC Haloperidol Placebo Risk ratio for clinical improvement 0.1 1 10 50 Study % Weight Arvanitis Beasley Bechelli Borison Chouinard Durost Garry Howard Marder Nishikawa 82 Nishikawa 84 Reschke Selman Serafetinides Simpson Spencer Vichaiya 18.9 31.2 2.0 0.5 3.1 1.1 3.4 3.3 11.4 0.4 0.5 2.5 19.1 0.5 0.5 1.1 0.5 1.57 (1.28,1.92)Overall (FE, 95% CI) Favours placebo Favours placebo
  • 27. Typical practice in Cochrane reviewsTypical practice in Cochrane reviews • Ignore missing data completely – Call the above an available case analysis – often called complete case analysis • Impute success, failure, best-case, worst-case etc – Call this an imputed case analysis – analysis often proceeds assuming imputed data are known
  • 29. Available case analysisAvailable case analysis • (a.k.a. complete case analysis) • Ignore missing data • May be biased • Over-precise – Consider a trial of 80 patients, with 60 followed up. We should be more uncertain with 60/80 of data than with 60/60 of data Treatment Control vs
  • 30. Imputation strategies (1/5)Imputation strategies (1/5) • Impute all successes • Impute all failures Treatment ControlTreatment Control
  • 31. Imputation strategies (2/5)Imputation strategies (2/5) • Best case for treatment • Worst case for treatment Treatment ControlTreatment Control
  • 32. Imputation strategies (3/5)Imputation strategies (3/5) • Impute treatment rate • Impute control rate Treatment ControlTreatment Control
  • 33. Imputation strategies (4/5)Imputation strategies (4/5) • Impute group-specific rate Treatment Control
  • 34. Using reasons for missingnessUsing reasons for missingness • Sometimes reasons for missing data are available • Use these to impute particular outcomes for missing participants • e.g. in haloperidol trials: – Lack of efficacy, relapse: impute failure – Positive response: impute success – Adverse effects, non-compliance: impute control event rate – Loss to follow-up, administrative reasons, patient sleeping: impute group-specific event rate Higgins , White and Wood. Clinical Trials 2008; 5: 225–239
  • 35. Application to haloperidolApplication to haloperidol 1.8 (1.4, 2.1) Q=22 1.5 (1.2, 1.7) Q=31 1.3 (1.1, 1.5) Q=34 1.4 (1.2, 1.6) Q=31 0.9 (0.8, 1.1) Q=62 2.4 (1.9, 3.0) Q=22 1.9 (1.5, 2.4) Q=22 1.2 (1.0, 1.3) Q=40 1.6 (1.3, 1.9) Q=27 (16 df) Fixed-effect meta-analysis incorporating available reasons for missing data according to observed group-specific event rate according to observed treatment event rate, pT according to observed control event rate, pC missing = success (1) missing = failure (0) Imputation worst case scenario for treatment best case scenario for treatment nothing (available case analysis)
  • 36. Application to haloperidolApplication to haloperidol Imputation 1.3 (28%) 1.0 (36%) 1.0 (25%) 1.0 (37%) 0.5 (33%) 2.5 (30%) 1.4 (25%) 0.9 (36%) 1.0 (31%) RR (weight) 1.8 (22%) 1.5 (32%) 1.1 (51%) 1.3 (27%) 0.7 (26%) 4.0 (11%) 2.4 (10%) 1.1 (47%) 1.5 (19%) RR (weight) incorporating available reasons for missing data according to observed group-specific event rate according to observed treatment event rate, pT according to observed control event rate, pC missing = success (1) missing = failure (0) worst case scenario for treatment best case scenario for treatment nothing (available case analysis) SelmanBeasley
  • 37. Generalization of imputing schemesGeneralization of imputing schemes • Consider control group only • To reflect informative missingness, specify odds ratio comparing event rate among missing participants vs event rate among observed participants • Call this informative missingness odds ratio (IMORC) • Similarly for treatment group (IMORT) vs
  • 38. incorporating available reasons for missing data 11according to observed group-specific event rate 1all according to observed treatment event rate, pT 1all according to observed control event rate, pC missing = success (1) 00missing = failure (0) IMORCIMORTImputation       C T C T p 1 p 1 p p     T C T C p 1 p 1 p p       M T T M T T p 1 p 1 p p       M C C M C C p 1 p 1 p p   0 [or ] [or 0] Impute to create worst case scenario for treatment  [or 0]0 [or ] to create best case scenario for treatment ConnectionsConnections
  • 39. Weighting studiesWeighting studies • Suppose we were to treat imputed data as known – Standard errors will tend to be too small and weights too big • Some simple alternative weights might be used – use the available case weights – use an effective sample size argument, but with revised event rates • We have derived more suitable weights based on IMORs: an analytic strategy
  • 40. metamiss in Statametamiss in Stata • All the above strategies are implemented in our program metamiss • Download it within Stata using ssc install metamiss • or download the latest version using net from http://www.mrc- bsu.cam.ac.uk/BSUsite/ Software/pub/software/stata/meta White and Higgins. The Stata Journal 2009; 9: 57-69.
  • 41.
  • 42. Overall (I-squared = 26.8%, p = 0.148) Selman Nishikawa_82 Borison Howard ID Chouinard Reschke Marder Simpson Serafetinides Durost Garry Spencer Nishikawa_84 Beasley Bechelli Vichaiya Arvanitis Study 1.2 1 5 Risk ratio, intervention vs. control Haloperidol meta-analysis: using reasons Haloperidol meta-analysis: using reasons
  • 43. Implications for practiceImplications for practice • We desire a strategy for primary analysis and for sensitivity analyses • Primary analysis – present available case analysis as a reference point – make use of available reasons for missingness – consider using IMORs based on external evidence or heuristic arguments
  • 44. Implications for practiceImplications for practice • Sensitivity analysis – Best-case/worst case scenarios are fine but probably too unrealistic – We suggest to use IMORs to move towards these analyses, using clinically realistic values – Should also evaluate impact of changing weights
  • 45. Treatmenteventrate Control event rate 0 0.25 0.5 0.75 1 0 0.25 0.5 0.75 1 all eventsbest-case worst-caseno events
  • 46. Application to haloperidolApplication to haloperidol 1.8 (1.4, 2.2) Q=22 1.4 (1.1, 1.7) Q=34 1.7 (1.4, 2.1) Q=25 1.4 (1.2, 1.7) Q=31 1.6 (1.3, 1.9) Q=27 (16 df) Fixed-effect meta-analysis IMORT = 1/2, IMORC = 1/2 IMORT = 1/2, IMORC = 2 Imputation Available case IMORT = 2, IMORC = 1/2 IMORT = 2, IMORC = 2 Conclusion: no important impact of missing data
  • 47. Missing data introduce extra uncertaintyMissing data introduce extra uncertainty • Even if we make assumptions about the missing data, we don’t know if they are correct • For example, we don’t know if IMOR is 0 or 1. • Allowing for missing data should introduce extra uncertainty – increase the standard error – hence down-weight trials with more missing data in the meta-analysis • We do this by allowing for uncertainty in the IMORs – sdlogimor(1) option in metamiss specifies a sensible degree of uncertainty about the IMOR White, Wood and Higgins. Statistics in Medicine 2008; 27: 711–727.
  • 48. sdlogimor(): technical detailssdlogimor(): technical details • We place a normal distribution on the log IMOR • For example, we might give the log IMOR mean -1 and standard deviation 1 – our “best guess” is log IMOR = -1 (IMOR = 0.37) – we are 68% sure the log IMOR lies between -2 and 0 (IMOR lies between 0.14 and 1) – we are 95% sure the log IMOR lies between -3 and +1 (IMOR lies between 0.05 and 2.7)
  • 49. Continuous dataContinuous data • Many methods available to primary trialists, e.g. – last observation carried forward – regression imputation – analytic approaches • Imputing is much more difficult for the meta-analyst • You can – Impute using the mean (corresponds to increasing sample size to include missing people) – Impute using a specific value – Apply an ‘IMMD’ or ‘IMMR’ (‘informative missingness mean difference’ or ‘…mean ratio’) • Methods to properly account for uncertainty are not yet developed
  • 50. RemarksRemarks • Many imputing schemes are available, but these should not be used to enter ‘filled-out’ data into RevMan • However, the point estimates from such analyses may be used with weights from an available case analysis (requires analyses outside of RevMan)
  • 51. Remarks (ctd)Remarks (ctd) • Informative missingness odds ratios offer advantages of – a generalization of the imputation schemes – can reflect ‘realistic’ scenarios – statistical expressions for variances – ability to incorporate prior distributions on IMORs • Sensitivity analyses are essential, and should ideally address both estimates and weights • See: Higgins JPT, White IR, Wood AM. Missing outcome data in meta- analysis of clinical trials: development and comparison of methods, with suggestions for practice. (Clinical Trials)
  • 52. No information on study characteristic for heterogeneity analysis No information on study characteristic for heterogeneity analysis
  • 53. General recommendations for dealing with missing data General recommendations for dealing with missing data • Whenever possible, contact original investigators to request missing data • Make explicit assumptions of methods used to address missing data • Conduct sensitivity analyses to assess how sensitive results are to reasonable changes in assumptions that are made • Address potential impact of missing data (known or suspected) on findings of the review in the Discussion section