Meta-analysis of continuous data:
Final values, change scores, and
            ANCOVA

                      Jo McKenzie

School of Public Health and Preventive Medicine, Monash University,
                              Australia
              joanne.mckenzie@med.monash.edu.au


Cochrane Statistical Methods Training Course: Addressing advanced
               issues in meta-analytical techniques
                   4-5th March 2010, Cardiff, UK
Outline of the presentation
   Common questions about final values (FV), change scores (CS),
    and ANCOVA in meta-analysis.
   Analysis of a randomised trial:
       Example demonstrating how the analytical methods compare.
       Properties of the estimators:
         • Conditional and unconditional on baseline imbalance,
         • Preferred estimator.

   Meta-analysis:
       A small simulation study:
         • Key points.
       Advice from the Cochrane Handbook for Systematic Reviews of
        Interventions.
       A proposed hierarchy of options.
       An example.
       Some final thoughts.
Common questions: SMG list
(October 2007)

“In a Cochrane Workshop I attended <snip> we were told to use
   the post test scores and SDs of continuous variables. When we
   asked about what to do in case there is a significant difference in
   pretest scores, they stated that if proper randomization is used,
   pretest scores should be similar and therefore we should only
   concern ourselves with the posttest scores. However, we all
   know that is not always the case in smaller studies.”

“I just spoke with <snip> and he strongly recommends using the
    change scores as this method better accounts for potential
    pretest differences.”

“… What do you recommend to authors in your CRG: use of
  final value or use of change scores?”

Common questions

   There were 16 posts to the list following
    the initial post with the conclusion:
    “… no consensus exists in the SMG”




Example comparing estimators
   Randomised trial carried out in Ubon Ratchathani
    province, north-east Thailand.
   Aimed to test the efficacy of a seasoning powder fortified
    with micronutrients.
   Groups:
       Intervention: fortified seasoning powder added to instant
        wheat noodles or rice,
       Control: unfortified seasoning powder added to instant wheat
        noodles or rice.
   Data collected at baseline and follow-up (31 weeks).
   Primary outcome was anaemia (defined from the
    continuous variable haemoglobin).


Post intervention haemoglobin vs baseline haemoglobin

[Figure: scatter plot of post intervention haemoglobin (g/L) against baseline haemoglobin (g/L) for the control and intervention groups, with the mean of each group marked.]
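Estimators

   Three commonly used estimators are the final value
    (FV), change score (CS) and analysis of covariance
    (ANCOVA) estimator.

    $\hat{\theta}_{FV} = \bar{y}_{int} - \bar{y}_{ctrl}$
    $\hat{\theta}_{CS} = (\bar{y}_{int} - \bar{y}_{ctrl}) - (\bar{x}_{int} - \bar{x}_{ctrl})$
    $\hat{\theta}_{ANCOVA} = (\bar{y}_{int} - \bar{y}_{ctrl}) - \beta (\bar{x}_{int} - \bar{x}_{ctrl})$

    where $\beta = \rho \, \sigma_y / \sigma_x$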
Observed and simulated data sets:
summary statistics

Dataset            Observed       Follow-up: intervention group    Follow-up: control group
                   correlation    Mean        SD                   Mean        SD
Observed data      0.629          121.0       10.1                 120.5       9.5
Simulated data 1   0.061          121.2       10.8                 120.6       8.8
Simulated data 2   0.567          121.2       10.8                 120.6       8.8
Simulated data 3   0.943          121.1       10.5                 120.5       9.0
Scatter plots of post intervention haemoglobin vs baseline
haemoglobin for observed and simulated data sets

[Figure: four scatter plots of post intervention haemoglobin (g/L) against baseline haemoglobin (g/L), by group (control, intervention): (a) Observed data (corr = 0.629); (b) Simulated data 1 (corr = 0.061); (c) Simulated data 2 (corr = 0.567); (d) Simulated data 3 (corr = 0.943).]
Estimated intervention effects (95% CIs) calculated using
different analytical methods for the four data sets

[Figure: estimated intervention effect on haemoglobin (g/L) with 95% CI, by analytical method (SAFV, SACS, ANCOVA), one panel per data set.]

P values shown in the figure:

Data set                               SAFV     SACS     ANCOVA
(a) Observed data (corr = 0.629)       0.540    0.001    0.012
(b) Simulated data 1 (corr = 0.061)    0.456    0.037    0.379
(c) Simulated data 2 (corr = 0.567)    0.533    0.003    0.027
(d) Simulated data 3 (corr = 0.943)    0.513    0.000    0.000

Some observations from
comparing the analytical methods
   Estimates of intervention effect:
        For a particular data set, the three estimators can produce
         different estimates of intervention effect.
          • When the correlation is close to 0, $\hat{\theta}_{ANCOVA} \approx \hat{\theta}_{FV}$.
          • When the correlation is close to 1, $\hat{\theta}_{ANCOVA} \approx \hat{\theta}_{CS}$.
        Over the data sets (varying correlation), the ANCOVA estimate
         varies; the estimates from a simple analysis of change scores (SACS)
         or of final values (SAFV) do not.
   Standard errors:
        The standard error (SE) of the FV estimator remains the same over
         all data sets.
        As the correlation increases, the SE of the CS estimator decreases.
         When the correlation is < 0.5, the SE of the CS estimator is greater
         than the SE of the FV estimator. This is reversed when the
         correlation is > 0.5.
        For a particular data set, the SE of the ANCOVA estimate is smaller
         than the SEs of the FV and CS estimates.
Relationship between the
estimators

    $\hat{\theta}_{ANCOVA} = (\bar{y}_{int} - \bar{y}_{ctrl}) - \beta (\bar{x}_{int} - \bar{x}_{ctrl})$

    Assuming $\sigma_y^2 = \sigma_x^2$:
        When the correlation is close to 0, $\beta \approx 0$;
         $\hat{\theta}_{ANCOVA} \approx \hat{\theta}_{FV}$
        When the correlation is close to 1, $\beta \approx 1$;
         $\hat{\theta}_{ANCOVA} \approx \hat{\theta}_{CS}$
   When there is minimal baseline imbalance, the
    three methods produce similar estimates since
    $(\bar{x}_{int} - \bar{x}_{ctrl}) \approx 0$.
Variance of the estimators

    Assuming $\sigma^2_{y,int} = \sigma^2_{y,ctrl} = \sigma^2_{x,int} = \sigma^2_{x,ctrl} = \sigma^2$:

    $\sigma^2_{FV} = \frac{2\sigma^2}{n}$

    $\sigma^2_{CS} = \frac{4\sigma^2}{n} (1 - \rho)$

    $\sigma^2_{ANCOVA} = \frac{2\sigma^2}{n} (1 - \rho^2)$

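The variance formulas can be evaluated over a range of correlations to see where the CS estimator overtakes the FV estimator. A small Python sketch, assuming a common variance and n participants per group; the function names and the values of sigma2 and n are illustrative.

```python
def var_fv(sigma2, n):
    """Variance of the final value estimator: 2 * sigma^2 / n."""
    return 2 * sigma2 / n


def var_cs(sigma2, n, rho):
    """Variance of the change score estimator: 4 * sigma^2 / n * (1 - rho)."""
    return 4 * sigma2 / n * (1 - rho)


def var_ancova(sigma2, n, rho):
    """Variance of the ANCOVA estimator: 2 * sigma^2 / n * (1 - rho^2)."""
    return 2 * sigma2 / n * (1 - rho ** 2)


sigma2, n = 1.0, 20  # hypothetical common variance and group size
for rho in (0.0, 0.25, 0.5, 0.75, 0.95):
    print(
        f"rho={rho:.2f}  FV={var_fv(sigma2, n):.4f}  "
        f"CS={var_cs(sigma2, n, rho):.4f}  "
        f"ANCOVA={var_ancova(sigma2, n, rho):.4f}"
    )
```

Under these assumptions the CS and FV variances are equal at a correlation of 0.5, and the ANCOVA variance is never larger than either.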
Statistical properties of the
estimators: unconditional inference

   How do the estimators perform over hypothetical
    repetitions of randomised trials where baseline
    imbalance randomly varies?
   Bias
       SAFV, SACS, and ANCOVA are unbiased.
       Type I error rates will be as designated.

   Precision
       ANCOVA is (generally) more efficient than
        SAFV or SACS.

Statistical properties of the
estimators: conditional inference
   How do the estimators perform over hypothetical
    repetitions of randomised trials with the same
    baseline imbalance?
   Bias
       ANCOVA is conditionally unbiased.
       SAFV is conditionally biased:
            $E[\bar{y}_{int} - \bar{y}_{ctrl} \mid \bar{x}_{int} - \bar{x}_{ctrl}] = \theta + \rho \frac{\sigma_y}{\sigma_x} (\bar{x}_{int} - \bar{x}_{ctrl})$
       SACS is conditionally biased:
            $E[(\bar{y}_{int} - \bar{y}_{ctrl}) - (\bar{x}_{int} - \bar{x}_{ctrl}) \mid \bar{x}_{int} - \bar{x}_{ctrl}] = \theta + \left( \rho \frac{\sigma_y}{\sigma_x} - 1 \right) (\bar{x}_{int} - \bar{x}_{ctrl})$
        where $\theta$ is the true intervention effect.
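For a given baseline imbalance, the conditional bias of SAFV is the imbalance multiplied by ρσ_y/σ_x, and that of SACS is the imbalance multiplied by (ρσ_y/σ_x − 1). A minimal Python sketch of these two expressions; the function names and the imbalance of 2 g/L are hypothetical.

```python
def conditional_bias_safv(imbalance, rho, sd_y=1.0, sd_x=1.0):
    """Conditional bias of the final value analysis for a given baseline imbalance."""
    return rho * sd_y / sd_x * imbalance


def conditional_bias_sacs(imbalance, rho, sd_y=1.0, sd_x=1.0):
    """Conditional bias of the change score analysis for a given baseline imbalance."""
    return (rho * sd_y / sd_x - 1.0) * imbalance


imbalance = 2.0  # hypothetical: intervention mean 2 g/L above control at baseline
for rho in (0.0, 0.5, 1.0):
    print(
        f"rho={rho:.1f}  SAFV bias={conditional_bias_safv(imbalance, rho):+.2f}  "
        f"SACS bias={conditional_bias_sacs(imbalance, rho):+.2f}"
    )
```

With equal SDs, SAFV is conditionally unbiased only when the correlation is 0 and SACS only when it is 1; in between, the two analyses are biased in opposite directions.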
Statistical properties of the
estimators: conditional inference

   Precision
       ANCOVA is (generally) more efficient
        than SAFV or SACS.

Unconditional or conditional
inference?
   There is some debate over which type of statistical inference is
    important [Senn 1989; Overall 1992; Wei 2001].
   “If I am cruising at 10,000m above sea level in mid-Atlantic
    and the captain informs me that three engines are
    on fire, I can hardly console myself with the thought that,
    on average, air travel is very safe” [Senn 1997].
   Most trialists are concerned about the potential impact of
    any observed baseline imbalance on their estimate of
    intervention effect. They are not reassured by the knowledge that,
    on average, the analytical method will provide an unbiased
    estimate of intervention effect.
   ANCOVA is the preferred analytical method both
    conditionally and unconditionally.
What about meta-analysis?
A small simulation study (1)
   Overview:
       Number of trials per meta-analysis randomly selected
        from U(3, 5).
       These were analysed using all SAFV, all SACS, all
        ANCOVA, or a random mix of the three analytical
        methods.
       Results pooled using the standard inverse variance fixed
        effect model.
       For each simulation scenario there were 50,000
        replicates.

A small simulation study (2)
   Number of participants per trial randomly selected
    from N(20, 9).
   Baseline and follow-up scores randomly sampled
    from a bivariate normal distribution for both the
    intervention and control groups (no intervention
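    effect).
    $\begin{pmatrix} x \\ y \end{pmatrix} \sim \mathrm{BVN}\left( \begin{pmatrix} 0 \\ 0.1 \end{pmatrix}, \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix} \right)$
   Assuming $\sigma^2_{y,int} = \sigma^2_{y,ctrl} = \sigma^2_{x,int} = \sigma^2_{x,ctrl} = 1$
   Seven correlations: $\rho$ = 0, 0.05, 0.25, 0.45, 0.65, 0.85, 0.95.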
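The design above can be sketched in a few lines of Python. This is a simplified illustration rather than the code used for the study: it assumes N(20, 9) denotes a variance of 9 and applies to each group separately, reads U(3, 5) as a discrete uniform on 3, 4, 5, uses 2,000 replicates instead of 50,000, and omits the random mix of methods. All function names are hypothetical. Because there is no true effect, the slope of the pooled estimate on the pooled baseline imbalance summarises the conditional bias.

```python
import random


def mean(v):
    return sum(v) / len(v)


def sample_var(v):
    m = mean(v)
    return sum((a - m) ** 2 for a in v) / (len(v) - 1)


def simulate_trial(rho, rng):
    """One two-arm trial with no true effect; returns estimates and their variances."""
    arms = []
    for _ in range(2):  # intervention, then control
        n = max(5, round(rng.gauss(20, 3)))  # assumption: variance 9, drawn per arm
        x = [rng.gauss(0, 1) for _ in range(n)]
        y = [0.1 + rho * a + (1 - rho ** 2) ** 0.5 * rng.gauss(0, 1) for a in x]
        arms.append((x, y))
    (x1, y1), (x2, y2) = arms
    n1, n2 = len(x1), len(x2)
    dx, dy = mean(x1) - mean(x2), mean(y1) - mean(y2)
    c1 = [b - a for a, b in zip(x1, y1)]  # change scores, intervention
    c2 = [b - a for a, b in zip(x2, y2)]  # change scores, control
    # Pooled within-arm sums of squares and cross-products for ANCOVA.
    sxx = sxy = syy = 0.0
    for x, y in arms:
        mx, my = mean(x), mean(y)
        sxx += sum((a - mx) ** 2 for a in x)
        sxy += sum((a - mx) * (b - my) for a, b in zip(x, y))
        syy += sum((b - my) ** 2 for b in y)
    beta = sxy / sxx
    resid_var = (syy - beta * sxy) / (n1 + n2 - 3)
    est = {"baseline": dx, "SAFV": dy, "SACS": dy - dx, "ANCOVA": dy - beta * dx}
    var = {
        "baseline": sample_var(x1) / n1 + sample_var(x2) / n2,
        "SAFV": sample_var(y1) / n1 + sample_var(y2) / n2,
        "SACS": sample_var(c1) / n1 + sample_var(c2) / n2,
        "ANCOVA": resid_var * (1 / n1 + 1 / n2 + dx ** 2 / sxx),
    }
    return est, var


def pool_fixed(estimates, variances):
    """Inverse-variance fixed-effect pooled estimate."""
    weights = [1 / v for v in variances]
    return sum(w * e for w, e in zip(weights, estimates)) / sum(weights)


def slope(xs, ys):
    """Least-squares slope of ys on xs."""
    mx, my = mean(xs), mean(ys)
    return sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / sum((a - mx) ** 2 for a in xs)


def run(rho, replicates=2000, seed=1):
    rng = random.Random(seed)
    pooled = {m: [] for m in ("baseline", "SAFV", "SACS", "ANCOVA")}
    for _ in range(replicates):
        trials = [simulate_trial(rho, rng) for _ in range(rng.randint(3, 5))]
        for m in pooled:
            pooled[m].append(pool_fixed([e[m] for e, _ in trials], [v[m] for _, v in trials]))
    return {m: slope(pooled["baseline"], pooled[m]) for m in ("SAFV", "SACS", "ANCOVA")}


slopes = run(rho=0.45)
for method, s in slopes.items():
    print(f"{method}: slope of pooled estimate on pooled baseline imbalance = {s:.2f}")
```

A positive slope for SAFV, a negative slope for SACS and a slope near zero for ANCOVA would be in line with the conditional bias results for single trials.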
Distribution of pooled baseline imbalance (corr = 0.45)

[Figure: density of the pooled baseline imbalance (x-axis from -1 to 1) in four panels: (a) SAFV; (b) SACS; (c) ANCOVA; (d) Random selection.]
Bias in the pooled estimate of intervention effect
conditional on baseline imbalance (graph 1)

[Figure: bias (y-axis, -0.5 to 0.5) against pooled baseline imbalance (x-axis, -0.5 to 0.5) in three panels: (a) Correlation = 0; (b) Correlation = 0.45; (c) Correlation = 0.95.]
Bias in the pooled estimate of intervention effect
conditional on baseline imbalance (graph 2)

[Figure: bias (y-axis, -0.5 to 0.5) against pooled baseline imbalance (x-axis, -0.5 to 0.5) in four panels: (a) Correlation = 0.05; (b) Correlation = 0.25; (c) Correlation = 0.65; (d) Correlation = 0.85.]
A small simulation study:
Key points (1)
   Pooled baseline imbalance is not often a problem (for trials
    with adequate allocation concealment):
       Less likely to be a problem as the number of trials increases,
        or the sample size of the component trials increases, or both.
   When there is no, or minimal, baseline imbalance:
       Pooling using all SACS, all SAFV, all ANCOVA, or a random
        selection of analytical methods will produce an unbiased
        pooled estimate of intervention effect.
       ANCOVA more efficient than SAFV or SACS.
       Efficiency of pooling using all SAFV compared with all SACS
        dependent on the ‘overall’ correlation between baseline and
        follow-up.

A small simulation study:
Key points (2)
   When there is pooled baseline imbalance:
       Pooling using all ANCOVA produces an unbiased pooled
        estimate of intervention effect. Guards against baseline
        imbalance.
       Pooling using all SAFV will produce a biased pooled
        estimate of intervention effect (except if the correlation
        tends to be close to 0).
       Pooling using all SACS will produce a biased pooled
        estimate of intervention effect (except if the correlation
        tends to be close to 1).
       Pooling using a random selection of analytical methods
        will produce a biased pooled estimate of intervention
        effect.
Meta-analysis in the real world

   In publications of trials:
       Generally only one type of analysis will be
        reported.
       Less frequently, an alternative analysis will be
        provided, or enough information to allow an
        alternative analysis to be performed.

   The analytical method used may (is likely
    to?) vary across trials.

Advice from the Handbook (1)
   Relevant sections: 9.4.5, 16.1.3.
   “The preferred statistical approach to accounting for
    baseline measurements of the outcome variable is to
    include the baseline outcome measurements as a covariate
    in a regression model or analysis of covariance (ANCOVA).
    These analyses produce an ‘adjusted’ estimate of the
    treatment effect together with its standard error. These
    analyses are the least frequently encountered, but as they
    give the most precise and least biased estimates of
    treatment effects they should be included in the analysis
    when they are available.” [Section 9.4.5.2]



Advice from the Handbook (2)
   “In practice an author is likely to discover that the studies included in a
    review may include a mixture of change-from-baseline and final
    value scores. However, mixing of outcomes is not a problem
    when it comes to meta-analysis of mean differences. There is no
    statistical reason why studies with change-from-baseline
    outcomes should not be combined in a meta-analysis with studies
    with final measurement outcomes when using the
    (unstandardized) mean difference method in RevMan. In a
    randomized trial, mean differences based on changes from
    baseline can usually be assumed to be addressing exactly the
    same underlying intervention effects as analyses based on final
    measurements. That is to say, the difference in mean final values
    will on average be the same as the difference in mean change
    scores. If the use of change scores does increase precision, the studies
    presenting change scores will appropriately be given higher weights in
    the analysis than they would have received if final values had been used,
    as they will have smaller standard deviations.” [Section 9.4.5.2]


Concern regarding this advice
   The potential of within-study selective reporting
    bias has been defined as:
    “… selection, on the basis of the results, of a subset of
      analyses undertaken to be included in a study
      publication.” [Williamson 2005]
   As shown earlier in the Thailand trial and
    simulated data sets, the analytical methods will
    often produce different estimates of intervention
    effect.
       Selection of the most favourable estimate is likely to
        result in a biased pooled estimate of intervention effect
        and inflated type I error rates.

Meta-analysis options:
A proposed hierarchy of options
    Please see handout.
1.   Obtain individual patient data for each trial and
     reanalyse these data.
2.   Pool using only ANCOVA results. For each trial
     recreate the ANCOVA estimate from the
     summary statistics provided. This will generally
     involve imputing only the correlation.
3.   Pool using results from only one analytical
     method (SACS or SAFV).
4.   Pool using a mix of results from different
     analytical methods.
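Option 2 can be sketched as follows: given baseline and follow-up means, SDs, group sizes and an imputed correlation, an approximate ANCOVA estimate and standard error are recreated for one trial. This is a large-sample Python sketch under stated assumptions (a common SD across groups, slope ρσ_y/σ_x, no allowance for uncertainty in the slope); the function name and all numbers are hypothetical.

```python
import math


def ancova_from_summary(n_int, n_ctrl, x_int, x_ctrl, y_int, y_ctrl, sd_x, sd_y, rho):
    """Approximate ANCOVA estimate and SE from summary statistics and a correlation."""
    beta = rho * sd_y / sd_x
    estimate = (y_int - y_ctrl) - beta * (x_int - x_ctrl)
    # Large-sample variance; equals 2 * sigma^2 * (1 - rho^2) / n for equal group sizes.
    variance = sd_y ** 2 * (1 - rho ** 2) * (1 / n_int + 1 / n_ctrl)
    return estimate, math.sqrt(variance)


# Hypothetical trial summary (illustrative values only).
trial = dict(n_int=50, n_ctrl=50, x_int=60.0, x_ctrl=62.0,
             y_int=58.0, y_ctrl=61.0, sd_x=10.0, sd_y=10.0)

# Sensitivity analysis over a range of imputed correlations.
for rho in (0.5, 0.7, 0.9):
    est, se = ancova_from_summary(rho=rho, **trial)
    print(f"rho={rho:.1f}  estimate={est:.2f}  SE={se:.2f}")
```

Because the correlation is imputed rather than observed, repeating the calculation over a plausible range of values shows how sensitive the recreated estimate is to that choice.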
An example: Calcium
supplementation on body weight
   Trowman R, Dumville JC, Hahn S, Torgerson DJ.
    A systematic review of the effects of calcium
    supplementation on body weight. Br J Nutr
    2006;95(6):1033-8.
   Trowman R, Dumville JC, Torgerson DJ, Cranny
    G. The impact of trial baseline imbalances should
    be considered in systematic reviews: a
    methodological case study. J Clin Epidemiol
    2007;60(12):1229-33.


Example 2: Study characteristics (modified table 1) (Trowman, 2006)

Study                        Number of     Age*  Sex                             Intervention                   Length of  Country
                             participants                                        (Ca concentration)             follow-up
Chee et al. (2003)           173           58·9  Female (postmenopausal)         Ca supplement (1200 mg/d)      24 months  Malaysia
Jensen et al. (2001)         52            NA    Female (obese postmenopausal)   Ca supplement (1000 mg/d)      26 weeks   Denmark
Lau et al. (2001)            185           57·0  Female (postmenopausal)         Ca supplement (800 mg/d)       24 months  China
Reid et al. (2002)           223           72·0  Female (postmenopausal)         Ca supplement (1000 mg/d)      24 months  New Zealand
Shapses et al. (2004)        36            59·3  Female (obese postmenopausal)   Ca supplement (1000 mg/d)      25 weeks   USA
Shapses et al. (2004)        30            56·0  Female (obese postmenopausal)   Ca supplement (1000 mg/d)      25 weeks   USA
Shapses et al. (2004)        42            41·0  Female (obese postmenopausal)   Ca supplement (1000 mg/d)      25 weeks   USA
Winters-Stone & Snow (2004)  23            24·8  Female (athletes)               Ca supplement (1000 mg/d)      12 months  USA
Zemel et al. (2004)          41            46    Mixed (obese)                   Calcium supplement (800 mg/d)  24 weeks   USA

  NA, not available
  * Mean age. When age was reported separately by subgroups, the mean between the groups was calculated.
Calcium supplementation on body weight (Trowman, 2006)


Trial              Year                  Baseline (weight kg)                      Follow-up (weight kg)               Change (weight kg)

                                Intervention               Control                Intervention         Control       Intervention          Control

                               n     Mean (SD)        n          Mean (SD)         Mean (SD)        Mean (SD)         Mean (SD)        Mean (SD)

Chee               2003       91      56.1 (8.9)     82           57.2 (9.4)          56.1 (?)         57.4 (?)        0.0 (2.6) a       0.2 (2.6) a

Jensen             2001       25    94.6 (14.0)a     27         93.8 (14.0)a      89.0 (12.7)a     89.1 (14.7)a             -5.6 (?)       -4.7 (?)
Lau                2001       95      56.9 (7.1)     90           58.9 (7.5)          57.4 (?)         58.6 (?)        0.5 (2.6)a       -0.3 (2.7)a

Reid               2002      111     66.0 (10.0)    112          68.0 (11.0)          65.7 (?)         67.9 (?)        -0.3 (1.8)        -0.1 (2.4)

Shapses 1c         2004       17      84.1 (9.4)     19          89.4 (10.3)          77.1 (?)         82.1 (?)        -7.0 (4.6)        -7.3 (5.3)

Shapses 2c         2004       11      85.9 (9.2)     11          94.2 (15.7)          79.2 (?)         86.6 (?)        -6.7 (2.6)        -7.6 (5.7)

Shapses 3c         2004       18     93.7 (13.6)     24          93.5 (14.3)          87.0 (?)         89.2 (?)        -6.7 (5.5)        -4.3 (3.5)

Winters-Stone      2004       13      57.2 (4.9)     10           54.1 (7.2)        56.3 (4.3)       54.8 (7.2)             -0.9 (?)        0.7 (?)
Zemel              2004       11     99.8 (14.9)     10         103.1 (19.3)          91.2 (?)         96.5 (?)       -8.6 (5.3)a       -6.6 (8.2) a


a Calculated from the standard error
b Follow-up sample size ntrt = 24 and nctrl = 24
c Shapses et al (Shapses et al, 2004) report on three randomised controlled trials.
Trials 1, 2, and 3 include postmenopausal women, postmenopausal women special diet, and premenopausal women respectively.
Meta-analysis of difference in means at baseline

Study            Mean difference (95% CI)    % Weight
Chee             -1.10 (-3.84, 1.64)         22.4
Jensen            0.80 (-6.82, 8.42)          2.9
Lau              -2.00 (-4.11, 0.11)         37.8
Reid             -2.00 (-4.76, 0.76)         22.1
Shapses 1        -5.30 (-11.74, 1.14)         4.1
Shapses 2        -8.30 (-19.05, 2.45)         1.5
Shapses 3         0.20 (-8.30, 8.70)          2.3
Winters-Stone     3.10 (-2.10, 8.30)          6.2
Zemel            -3.30 (-18.15, 11.55)        0.8
Overall          -1.58 (-2.88, -0.29)       100.0

[Forest plot: x-axis is the mean difference, from -10 to 10; values below 0 are labelled "Favours Ca supplementation" and values above 0 "Favours control".]
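The overall baseline difference can be checked against the study-level results. A Python sketch, assuming inverse-variance fixed-effect pooling and that each standard error can be recovered from the reported 95% CI as (upper - lower) / (2 x 1.96); the function and variable names are illustrative.

```python
import math

# (study, mean difference, lower 95% limit, upper 95% limit) as reported above
baseline_rows = [
    ("Chee", -1.10, -3.84, 1.64),
    ("Jensen", 0.80, -6.82, 8.42),
    ("Lau", -2.00, -4.11, 0.11),
    ("Reid", -2.00, -4.76, 0.76),
    ("Shapses 1", -5.30, -11.74, 1.14),
    ("Shapses 2", -8.30, -19.05, 2.45),
    ("Shapses 3", 0.20, -8.30, 8.70),
    ("Winters-Stone", 3.10, -2.10, 8.30),
    ("Zemel", -3.30, -18.15, 11.55),
]


def pool_fixed_effect(rows, z=1.96):
    """Inverse-variance fixed-effect pooling from mean differences and 95% CIs."""
    weights = []
    for _, _, lo, hi in rows:
        se = (hi - lo) / (2 * z)  # back-calculate the SE from the CI width
        weights.append(1 / se ** 2)
    total = sum(weights)
    pooled = sum(w * row[1] for w, row in zip(weights, rows)) / total
    se_pooled = math.sqrt(1 / total)
    percent = [100 * w / total for w in weights]
    return pooled, pooled - z * se_pooled, pooled + z * se_pooled, percent


pooled, lower, upper, percent = pool_fixed_effect(baseline_rows)
print(f"Overall: {pooled:.2f} ({lower:.2f}, {upper:.2f})")
for (study, *_), pct in zip(baseline_rows, percent):
    print(f"{study:15s} {pct:5.1f}%")
```

Under these assumptions the pooled estimate, its interval and the study weights agree with the reported values to rounding.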
Calcium supplementation on body weight (Trowman, 2006)


Trial              Year                  Baseline (weight kg)                      Follow-up (weight kg)          Correlation between baseline
                                                                                                                          and follow-up
                                Intervention               Control                Intervention         Control       Intervention        Control

                               n     Mean (SD)        n          Mean (SD)         Mean (SD)        Mean (SD)        Correlation    Correlation

Chee               2003       91      56.1 (8.9)     82           57.2 (9.4)        56.1 (8.9)       57.4 (9.4)             0.957         0.962

Jensen             2001       25    94.6 (14.0)a     27         93.8 (14.0)a      89.0 (12.7)a     89.1 (14.7)a

Lau                2001       95      56.9 (7.1)     90           58.9 (7.5)        57.4 (7.1)       58.6 (7.5)             0.933         0.935

Reid               2002      111     66.0 (10.0)    112          68.0 (11.0)       65.7 (10.0)      67.9 (11.0)             0.984         0.976

Shapses 1c         2004       17      84.1 (9.4)     19          89.4 (10.3)        77.1 (9.4)      82.1 (10.3)             0.880         0.868

Shapses 2c         2004       11      85.9 (9.2)     11          94.2 (15.7)        79.2 (9.2)      86.6 (15.7)             0.960         0.934

Shapses 3c         2004       18     93.7 (13.6)     24          93.5 (14.3)       87.0 (13.6)      89.2 (14.3)             0.918         0.970

Winters-Stone      2004       13      57.2 (4.9)     10           54.1 (7.2)        56.3 (4.3)       54.8 (7.2)

Zemel              2004       11     99.8 (14.9)     10         103.1 (19.3)       91.2 (14.9)      96.5 (19.3)             0.937         0.910

a Calculated from the standard error
b Follow-up sample size ntrt = 24 and nctrl = 24
c Shapses et al (2004) report on three randomised controlled trials.
Trials 1, 2, and 3 include postmenopausal women, postmenopausal women (special diet), and premenopausal women, respectively.
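
The tabulated correlations appear consistent with the standard identity var(change) = var(baseline) + var(follow-up) - 2 x corr x SD(baseline) x SD(follow-up), applied with the follow-up SD imputed from the baseline SD. The sketch below is an illustration under that assumption, not the author's stated method. It also assumes that the last two columns of the earlier table are change-score means (SDs). The function name is hypothetical.

```python
def implied_correlation(sd_baseline, sd_followup, sd_change):
    """Baseline/follow-up correlation implied by the three SDs.

    Rearranges var(change) = var(x) + var(y) - 2 * r * sd_x * sd_y for r.
    """
    return (sd_baseline**2 + sd_followup**2 - sd_change**2) / (
        2 * sd_baseline * sd_followup
    )


# (baseline SD, follow-up SD imputed as the baseline SD, change-score SD)
# for the Shapses 1 trial, read from the tables in this example.
shapses_1 = {
    "intervention": (9.4, 9.4, 4.6),
    "control": (10.3, 10.3, 5.3),
}

for arm, (sd_x, sd_y, sd_d) in shapses_1.items():
    print(arm, round(implied_correlation(sd_x, sd_y, sd_d), 3))
```

For Shapses 1 this gives values that round to the tabulated 0.880 and 0.868. Trials without a reported change-score SD (Winters-Stone) cannot be handled this way, which matches the blank cells in the table.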
Combining intervention estimates from CS only

Study              Mean difference (95% CI)    % Weight
Chee               -0.20 (-0.98, 0.58)             22.5
Jensen             -0.90 (-2.64, 0.84)              4.4
Lau                 0.80 (0.04, 1.56)              23.1
Reid               -0.20 (-0.76, 0.36)             43.7
Shapses 1           0.30 (-2.93, 3.53)              1.3
Shapses 2           0.90 (-2.80, 4.60)              1.0
Shapses 3          -2.40 (-5.30, 0.50)              1.6
Winters-Stone      -1.60 (-4.19, 0.99)              2.0
Zemel              -2.00 (-7.97, 3.97)              0.4
Overall            -0.05 (-0.42, 0.31)            100.0

[Forest plot: x-axis is the mean difference, from -10 to 10; values below 0 favour Ca supplementation, values above 0 favour control.]
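
The pooled result can be approximately re-derived from the listed rows. The sketch below assumes a fixed-effect inverse-variance model (the model named for the simulation study) and that each standard error can be back-calculated from the reported 95% CI using a normal quantile of 1.96. Both are assumptions about how the plot was produced, and the function and variable names are hypothetical.

```python
import math

# (study, mean difference, lower 95% limit, upper 95% limit), as listed above
STUDIES = [
    ("Chee", -0.20, -0.98, 0.58),
    ("Jensen", -0.90, -2.64, 0.84),
    ("Lau", 0.80, 0.04, 1.56),
    ("Reid", -0.20, -0.76, 0.36),
    ("Shapses 1", 0.30, -2.93, 3.53),
    ("Shapses 2", 0.90, -2.80, 4.60),
    ("Shapses 3", -2.40, -5.30, 0.50),
    ("Winters-Stone", -1.60, -4.19, 0.99),
    ("Zemel", -2.00, -7.97, 3.97),
]

Z = 1.96  # normal quantile for a 95% CI (assumed)


def pool_fixed_effect(studies, z=Z):
    """Inverse-variance fixed-effect pooling, with SEs recovered from the CIs."""
    weights = []
    weighted_sum = 0.0
    for _name, estimate, lower, upper in studies:
        se = (upper - lower) / (2 * z)  # CI half-width divided by z
        weight = 1.0 / se**2
        weights.append(weight)
        weighted_sum += weight * estimate
    total = sum(weights)
    pooled = weighted_sum / total
    se_pooled = math.sqrt(1.0 / total)
    percent = [100.0 * w / total for w in weights]
    return pooled, pooled - z * se_pooled, pooled + z * se_pooled, percent


pooled, ci_low, ci_high, percent_weights = pool_fixed_effect(STUDIES)
print(f"Pooled mean difference: {pooled:.2f} ({ci_low:.2f}, {ci_high:.2f})")
for (name, *_), pct in zip(STUDIES, percent_weights):
    print(f"{name:<15}{pct:5.1f}%")
```

The result lands close to the Overall row. Small discrepancies in the weights and the upper limit are expected, because the inputs are CI limits rounded to two decimal places.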
An example: Calcium supplementation on body weight
   In this example, pooling SACS estimates is likely to
    produce a pooled estimate of intervention effect similar to
    pooling ANCOVA estimates.
   This is because, although there is baseline imbalance
    across the trials, the correlation between baseline weight
    and follow-up weight is likely to be large.
   What did the review authors do?
       Pooled estimates of intervention effect calculated from
        FV. Where SDs at follow-up were missing, they assumed
        the baseline SDs (SDs imputed in 7 of 9 trials).
       Also undertook a meta-regression adjusting for the baseline
        difference between groups.
Some final thoughts (1)
   Should we pre-specify a primary analysis in the
    protocol of the review?
       For trials, it is common to pre-specify the analysis
        in the trial protocol. This may include pre-specifying
        the covariates to adjust for (e.g. Senn 1989;
        Raab 2000).
OR ...
   Should we undertake multiple analyses and draw
    conclusions from an interpretation based on the
    results of these analyses?
Some final thoughts (2)
   Option 2 (recreate the ANCOVA estimates) has advantages
    in terms of bias and precision, but it will generally involve
    imputing correlation coefficients (and perhaps SDs).
   Perhaps we could pre-specify an approach such as the
    following in review protocols:
       Pool using results from only one analytical method (SAFV or
        SACS). In terms of bias, this will generally be reasonable. In
        terms of precision, this is not likely to provide an optimal
        solution.
         • The analytical method selected could be the one for which the summary
           statistics are most frequently reported. In the Trowman example, this
           would be CS.
       If there is large baseline imbalance across the trials, a ‘black
        belt’ approach of recreating the ANCOVA estimates using a
        range of correlations, in separate re-analyses, could be
        undertaken.
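
The 'black belt' re-analysis could be sketched as follows for a single trial (Shapses 1). The estimate uses the ANCOVA estimator given earlier, (y_int - y_ctrl) - beta (x_int - x_ctrl) with beta = rho x SD_y / SD_x. The standard error generalises the equal-group-size variance 2 sigma^2 (1 - rho^2) / n to unequal group sizes, which is an assumption of this sketch. The follow-up SD is taken to equal a pooled baseline SD, so beta = rho. The grid of correlations (0.5, 0.7, 0.9) and all names are hypothetical.

```python
import math


def recreate_ancova(y_int, y_ctrl, x_int, x_ctrl,
                    sd_int, sd_ctrl, n_int, n_ctrl, rho):
    """Approximate ANCOVA estimate and SE from summary statistics.

    Assumes one common SD for baseline and follow-up, pooled over arms.
    """
    pooled_var = ((n_int - 1) * sd_int**2 + (n_ctrl - 1) * sd_ctrl**2) / (
        n_int + n_ctrl - 2
    )
    beta = rho  # beta = rho * sd_y / sd_x, with sd_y == sd_x assumed here
    estimate = (y_int - y_ctrl) - beta * (x_int - x_ctrl)
    se = math.sqrt(pooled_var * (1 - rho**2) * (1 / n_int + 1 / n_ctrl))
    return estimate, se


# Shapses 1: follow-up and baseline means, baseline SDs, and group sizes
shapses_1 = dict(y_int=77.1, y_ctrl=82.1, x_int=84.1, x_ctrl=89.4,
                 sd_int=9.4, sd_ctrl=10.3, n_int=17, n_ctrl=19)

for rho in (0.5, 0.7, 0.9):  # hypothetical range of imputed correlations
    est, se = recreate_ancova(**shapses_1, rho=rho)
    print(f"rho = {rho:.1f}: estimate = {est:.2f}, SE = {se:.2f}")
```

At rho = 0 the estimate reduces to the FV difference (-5.0 kg) and at rho = 1 to the CS difference (0.3 kg). Repeating this for each trial and pooling at each correlation would give the separate re-analyses described above.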
References
   Overall JE, Magee KN. Directional baseline differences and type I error probabilities
    in randomized clinical trials. Journal of Biopharmaceutical Statistics
    1992;2(2):189-203.
   Raab GM, Day S, Sales J. How to select covariates to include in the analysis of a
    clinical trial. Control Clin Trials 2000;21(4):330-42.
   Senn SJ. Covariate imbalance and random allocation in clinical trials. Statistics in
    Medicine 1989;8(4):467-75.
   Senn S. Design and interpretation of clinical trials as seen by a statistician.
    Chichester: Wiley & Sons, 1997.
   Trowman R, Dumville JC, Hahn S, Torgerson DJ. A systematic review of the effects
    of calcium supplementation on body weight. Br J Nutr 2006;95(6):1033-8.
   Trowman R, Dumville JC, Torgerson DJ, Cranny G. The impact of trial baseline
    imbalances should be considered in systematic reviews: a methodological case
    study. J Clin Epidemiol 2007;60(12):1229-33.
   Wei L, Zhang J. Analysis of data with imbalance in the baseline outcome variable for
    randomized clinical trials. Drug Information Journal 2001;35:1201-14.
   Williamson PR, Gamble C, Altman DG, Hutton JL. Outcome selection bias in
    meta-analysis. Statistical Methods in Medical Research 2005;14:515-24.

2010 smg training_cardiff_day1_session1 (1 of 3)_mckenzie

  • 1. Meta-analysis of continuous data: Final values, change scores, and ANCOVA Jo McKenzie School of Public Health and Preventive Medicine, Monash University, Australia joanne.mckenzie@med.monash.edu.au Cochrane Statistical Methods Training Course: Addressing advanced issues in meta-analytical techniques 4-5th March 2010, Cardiff, UK
  • 2. Outline of the presentation  Common questions about final values (FV), change scores (CS), and ANCOVA in meta-analysis.  Analysis of a randomised trial:  Example demonstrating how the analytical methods compare.  Properties of the estimators: • Conditional and unconditional on baseline imbalance, • Preferred estimator.  Meta-analysis:  A small simulation study: • Key points.  Advice from the Cochrane Handbook for Systematic Reviews of Interventions.  A proposed hierarchy of options.  An example.  Some final thoughts. 2
  • 3. Common questions: SMG list (October 2007) “In a Cochrane Workshop I attended <snip> we were told to use the post test scores and SDs of continuous variables. When we asked about what to do in case there is a significant difference in pretest scores, they stated that if proper randomization is used, pretest scores should be similar and therefore we should only concern ourselves with the posttest scores. However, we all know that is not always the case in smaller studies.” “I just spoke with <snip> and he strongly recommends using the change scores as this method better accounts for potential pretest differences.” “… What do you recommend to authors in your CRG: use of final value or use of change scores?” 3
  • 4. Common questions  There were 16 posts to the list following the initial post with the conclusion: “… no consensus exists in the SMG” 4
  • 5. Example comparing estimators  Randomised trial carried out in the Ubon Ratchathani province NE Thailand.  Aimed to test the efficacy of a seasoning powder fortified with micronutrients.  Groups:  Intervention: fortified seasoning powder added to instant wheat noodles or rice,  Control: unfortified seasoning powder added to instant wheat noodles or rice.  Data collected at baseline and follow-up (31 weeks).  Primary outcome was anaemia (defined from the continuous variable haemoglobin). 5
  • 6. Post intervention haemoglobin vs baseline haemoglobin 160 Group Control Intervention Post intervention haemoglobin (g/L) Mean control Mean intervention 100 12080 140 60 80 100 120 140 Baseline haemoglobin (g/L)
  • 7. Estimators  Three commonly used estimators are the final value (FV), change score (CS) and analysis of covariance (ANCOVA) estimator. ˆFV  yint  yctrl  CS  ( yint  yctrl )  ( xint  xctrl ) ˆ  ANCOVA  ( yint  yctrl )   ( xint  xctrl ) ˆ y where   x 7
  • 8. Observed and simulated data sets: summary statistics Dataset Observed Follow-up correlation Intervention group Control group Mean SD Mean SD Observed data 0.629 121.0 10.1 120.5 9.5 Simulated data 1 0.061 121.2 10.8 120.6 8.8 Simulated data 2 0.567 121.2 10.8 120.6 8.8 Simulated data 3 0.943 121.1 10.5 120.5 9.0 8
  • 9. Scatter plots of post intervention haemoglobin vs baseline haemoglobin for observed and simulated data sets (a) Observed data (corr = 0.629) (b) Simulated data 1 (corr = 0.061) 100 120 140 160 100 120 140 160 Post intervention haemoglobin (g/L) 80 80 60 80 100 120 140 60 80 100 120 140 (c) Simulated data 2 (corr = 0.567) (d) Simulated data 3 (corr = 0.943) 100 120 140 160 100 120 140 160 Group Control Intervention 80 80 60 80 100 120 140 60 80 100 120 140 Baseline haemoglobin (g/L)
  • 10. Estimated intervention effects (95% CIs) calculated using different analytical methods for the four data sets (a) Observed data (corr = 0.629) (b) Simulated data 1 (corr = 0.061) Estimated intervention effect on haemoglobin (g/L) p = 0.540 p = 0.001 p = 0.012 p = 0.456 p = 0.037 p = 0.379 4 4 2 2 0 0 -2 -2 SAFV SACS ANCOVA SAFV SACS ANCOVA (c) Simulated data 2 (corr = 0.567) (d) Simulated data 3 (corr = 0.943) p = 0.533 p = 0.003 p = 0.027 p = 0.513 p = 0.000 p = 0.000 4 4 2 2 0 0 -2 -2 SAFV SACS ANCOVA SAFV SACS ANCOVA Analytical method
  • 11. Some observations from comparing the analytical methods  Estimates of intervention effect:  For a particular data set, the three estimators can produce different estimates of intervention effect. • When the correlation is close to 0,  ANCOVA   FV . ˆ ˆ • When the correlation is close to 1,  ANCOVA   CS ˆ ˆ .  Over the data sets (varying correlation), the ANCOVA estimate varies, however, a simple analysis of either change scores (SACS) or final values (SAFV) does not.  Standard errors:  The standard error (SE) of the FV estimator remains the same over all data sets.  As the correlation increases, the SE of the CS estimator decreases. When the correlation < 0.5, the SE of CS estimator is > SE of the FV estimator. This is reversed when the correlation is > 0.5.  For a particular data set, the SE of the ANCOVA estimate is smaller compared to the SEs of FV and CS. 11
  • 12. Relationship between the estimators ˆANCOVA  ( yint  yctrl )   ( xint  xctrl ) Assuming  y  x : 2 2   When the correlation is close to 0,   0 ; ˆANCOVA  ˆFV  When the correlation is close to 1,   1 ; ˆANCOVA   CS ˆ  When there is minimal baseline imbalance; the three methods produce similar estimates since ( xint  xctrl )  0 . 12
  • 13. Variance of the estimators Assuming  y ,int   y ,ctrl   x ,int   x ,ctrl   2 2 2 2 2  2 2  FV 2  n 4 2  2 CS  1    n 2 2  ANCOVA 2  n  1  2  13
  • 14. Statistical properties of the estimators: unconditional inference  How do the estimators perform over hypothetical repetitions of randomised trials where baseline imbalance randomly varies?  Bias  SAFV, SACS, and ANCOVA are unbiased.  Type I error rates will be as designated.  Precision  ANCOVA is (generally) more efficient compared with SAFV, or a SACS. 14
  • 15. Statistical properties of the estimators: conditional inference  How do the estimators perform over hypothetical repetitions of randomised trials with the same baseline imbalance?  Bias  ANCOVA is conditionally unbiased.  SAFV is conditionally biased:  y   yint  yctrl | xint  xctrl        xint  xctrl    x   SACS is conditionally biased:  y   yint  yctrl   xint  xctrl  | xint  xctrl          1xint  xctrl    x  15
  • 16. Statistical properties of the estimators: conditional inference  Precision  ANCOVA is (generally) more efficient compared to SAFV, or a SACS. 16
  • 17. Unconditional or conditional inference?  Some debate over which type of statistical inference is important [Senn 1989; Overall 1992; Wei 2001].  “If I am cruising at 10,000m above sea level in mid- Atlantic and the captain informs me that three engines are on fire, I can hardly console myself with the thought that, on average, air travel is very safe” [Senn 1997].  Most trialists are concerned about the potential impact of any observed baseline imbalance on their estimate of intervention effect. Not reassured by the knowledge that, on average, the analytical method will provide an unbiased estimate of intervention effect.  ANCOVA the preferred analytical method both conditionally and unconditionally. 17
  • 18. What about meta-analysis? A small simulation study (1)  Overview:  Number of trials per meta-analysis randomly selected from U(3, 5).  These were analysed using all SAFV, all SACS, all ANCOVA, or a random mix of the three analytical methods.  Results pooled using the standard inverse variance fixed effect model.  For each simulation scenario there were 50,000 replicates. 18
  • 19. A small simulation study (2)  Number of participants per trial randomly selected from N(20, 9).  Baseline and follow-up scores randomly sampled from a bivariate normal distribution for both the intervention and control groups (no intervention effect). x 0   1     ~ BVN   ,    y     0.1      1    Assuming  y ,int   y ,ctrl   x2,int   x2,ctrl  1 2 2  Seven correlations: 0, 0.05, 0.25, 0.45, 0.65, 0.85, 0.95. 19
  • 20. Distribution of pooled baseline imbalance (corr = 0.45) (a) SAFV (b) SACS 2 2 1.5 1.5 Density Density 1 1 .5 .5 0 0 -1 -.5 0 .5 1 -1 -.5 0 .5 1 (c) ANCOVA (d) Random selection 2 2 1.5 1.5 Density Density 1 1 .5 .5 0 0 -1 -.5 0 .5 1 -1 -.5 0 .5 1 Pooled baseline imbalance
  • 21. Bias in the pooled estimate of intervention effect conditional on baseline imbalance (graph 1) (a) Correlation = 0 (b) Correlation = 0.45 .5 .5 0 0 -.5 -.5 -.5 0 .5 -.5 0 .5 Bias (c) Correlation = 0.95 .5 0 -.5 -.5 0 .5 Pooled baseline imbalance
  • 22. Bias in the pooled estimate of intervention effect conditional on baseline imbalance (graph 2) .5 (a) Correlation = 0.05 (b) Correlation = 0.25 .5 0 0 -.5 -.5 -.5 0 .5 -.5 0 .5 Bias (c) Correlation = 0.65 (d) Correlation = 0.85 .5 .5 0 0 -.5 -.5 -.5 0 .5 -.5 0 .5 Pooled baseline imbalance
  • 23. A small simulation study: Key points (1)  Pooled baseline imbalance is not often a problem (for trials with adequate allocation concealment):  Less likely to be a problem as the number of trials increases, or the sample size of the component trials increases, or both.  When there is no, or minimal, baseline imbalance:  Pooling using all SACS, all SAFV, all ANCOVA, or a random selection of analytical methods will produce an unbiased pooled estimate of intervention effect.  ANCOVA more efficient than SAFV or SACS.  Efficiency of pooling using all SAFV compared with all SACS dependent on the ‘overall’ correlation between baseline and follow-up. 23
  • 24. A small simulation study: Key points (2)  When there is pooled baseline imbalance:  Pooling using all ANCOVA produces an unbiased pooled estimate of intervention effect. Guards against baseline imbalance.  Pooling using all SAFV will produce a biased pooled estimate of intervention effect (except if the correlation tends to be close to 0).  Pooling using all SACS will produce a biased pooled estimate of intervention effect (except if the correlation tends to be close to 1).  Pooling using a random selection of analytical methods will produce a biased pooled estimate of intervention effect. 24
  • 25. Meta-analysis in the real world
      - In publications of trials:
          - Generally only one type of analysis will be reported.
          - Less frequently, an alternative analysis will be provided, or enough information to allow an alternative analysis to be performed.
          - The analytical method used may (is likely to?) vary across trials.
  • 26. Advice from the Handbook (1)
      - Relevant sections: 9.4.5, 16.1.3.
      - “The preferred statistical approach to accounting for baseline measurements of the outcome variable is to include the baseline outcome measurements as a covariate in a regression model or analysis of covariance (ANCOVA). These analyses produce an ‘adjusted’ estimate of the treatment effect together with its standard error. These analyses are the least frequently encountered, but as they give the most precise and least biased estimates of treatment effects they should be included in the analysis when they are available.” [Section 9.4.5.2]
  • 27. Advice from the Handbook (2)
      - “In practice an author is likely to discover that the studies included in a review may include a mixture of change-from-baseline and final value scores. However, mixing of outcomes is not a problem when it comes to meta-analysis of mean differences. There is no statistical reason why studies with change-from-baseline outcomes should not be combined in a meta-analysis with studies with final measurement outcomes when using the (unstandardized) mean difference method in RevMan. In a randomized trial, mean differences based on changes from baseline can usually be assumed to be addressing exactly the same underlying intervention effects as analyses based on final measurements. That is to say, the difference in mean final values will on average be the same as the difference in mean change scores. If the use of change scores does increase precision, the studies presenting change scores will appropriately be given higher weights in the analysis than they would have received if final values had been used, as they will have smaller standard deviations.” [Section 9.4.5.2]
  • 28. Concern regarding this advice
      - Within-study selective reporting has been defined as: “… selection, on the basis of the results, of a subset of analyses undertaken to be included in a study publication.” [Williamson 2005]
      - As shown earlier in the Thailand trial and simulated data sets, the analytical methods will often produce different estimates of intervention effect.
      - Selection of the most favourable estimate is likely to result in a biased pooled estimate of intervention effect and inflated type I error rates.
  • 29. Meta-analysis options: A proposed hierarchy of options
      - Please see handout.
      1. Obtain individual patient data for each trial and reanalyse these data.
      2. Pool using only ANCOVA results. For each trial recreate the ANCOVA estimate from the summary statistics provided. This will generally involve imputing only the correlation.
      3. Pool using results from only one analytical method (SACS or SAFV).
      4. Pool using a mix of results from different analytical methods.
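Option 2 can be illustrated with a short sketch. This is not the procedure from the handout, which is not reproduced in these slides; it assumes a common large-sample approximation in which the ANCOVA slope is taken as r × SD(follow-up) / SD(baseline) with SDs pooled across arms, and the standard error ignores small-sample corrections. The function name `ancova_from_summary`, its parameters, and the trial values in the usage line are hypothetical.

```python
from math import sqrt


def ancova_from_summary(n_t, n_c, base_t, base_c, fv_t, fv_c, sd_base, sd_fv, r):
    """Approximate ANCOVA estimate and SE from summary statistics.

    Assumptions (not from the slides): sd_base and sd_fv are SDs pooled
    across arms, r is an imputed baseline/follow-up correlation, and the
    SE uses the large-sample residual variance sd_fv**2 * (1 - r**2).
    """
    beta = r * sd_fv / sd_base  # slope of follow-up on baseline
    # Final-value difference, adjusted for the observed baseline difference.
    estimate = (fv_t - fv_c) - beta * (base_t - base_c)
    se = sd_fv * sqrt((1 - r ** 2) * (1 / n_t + 1 / n_c))
    return estimate, se


# Hypothetical trial: 50 per arm, intervention arm starts 2 units lower.
est, se = ancova_from_summary(50, 50, 68.0, 70.0, 66.5, 69.0, 10.0, 10.0, 0.9)
print(f"ANCOVA estimate {est:.2f}, SE {se:.2f}")
```

Under this approximation, setting r = 0 returns the difference in final values, and setting r = 1 with equal SDs returns the difference in change scores, so the imputed correlation determines where the recreated estimate falls between the two simple analyses.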
  • 30. An example: Calcium supplementation on body weight
      - Trowman R, Dumville JC, Hahn S, Torgerson DJ. A systematic review of the effects of calcium supplementation on body weight. Br J Nutr 2006;95(6):1033-8.
      - Trowman R, Dumville JC, Torgerson DJ, Cranny G. The impact of trial baseline imbalances should be considered in systematic reviews: a methodological case study. J Clin Epidemiol 2007;60(12):1229-33.
  • 31. Example 2: Study characteristics (modified table 1) (Trowman, 2006)

      Study | Number of participants | Age* | Sex | Intervention (Ca concentration) | Length of follow-up | Country
      Chee et al. (2003) | 173 | 58·9 | Female (postmenopausal) | Ca supplement (1200 mg/d) | 24 months | Malaysia
      Jensen et al. (2001) | 52 | NA | Female (obese postmenopausal) | Ca supplement (1000 mg/d) | 26 weeks | Denmark
      Lau et al. (2001) | 185 | 57·0 | Female (postmenopausal) | Ca supplement (800 mg/d) | 24 months | China
      Reid et al. (2002) | 223 | 72·0 | Female (postmenopausal) | Ca supplement (1000 mg/d) | 24 months | New Zealand
      Shapses et al. (2004) | 36 | 59·3 | Female (obese postmenopausal) | Ca supplement (1000 mg/d) | 25 weeks | USA
      Shapses et al. (2004) | 30 | 56·0 | Female (obese postmenopausal) | Ca supplement (1000 mg/d) | 25 weeks | USA
      Shapses et al. (2004) | 42 | 41·0 | Female (obese postmenopausal) | Ca supplement (1000 mg/d) | 25 weeks | USA
      Winters-Stone & Snow (2004) | 23 | 24·8 | Female (athletes) | Ca supplement (1000 mg/d) | 12 months | USA
      Zemel et al. (2004) | 41 | 46 | Mixed (obese) | Calcium supplement (800 mg/d) | 24 weeks | USA

      NA, not available.
      * Mean age. When age was reported separately by subgroups, the mean between the groups was calculated.
  • 32. Calcium supplementation on body weight (Trowman, 2006)

      Weights in kg; values are mean (SD). Int = intervention, Ctrl = control. (?) = SD not available.

      Trial | Year | Int n | Baseline, Int | Ctrl n | Baseline, Ctrl | Follow-up, Int | Follow-up, Ctrl | Change, Int | Change, Ctrl
      Chee | 2003 | 91 | 56.1 (8.9) | 82 | 57.2 (9.4) | 56.1 (?) | 57.4 (?) | 0.0 (2.6) [a] | 0.2 (2.6) [a]
      Jensen | 2001 | 25 | 94.6 (14.0) [a] | 27 | 93.8 (14.0) [a] | 89.0 (12.7) [a] | 89.1 (14.7) [a] | -5.6 (?) | -4.7 (?)
      Lau | 2001 | 95 | 56.9 (7.1) | 90 | 58.9 (7.5) | 57.4 (?) | 58.6 (?) | 0.5 (2.6) [a] | -0.3 (2.7) [a]
      Reid | 2002 | 111 | 66.0 (10.0) | 112 | 68.0 (11.0) | 65.7 (?) | 67.9 (?) | -0.3 (1.8) | -0.1 (2.4)
      Shapses 1 [c] | 2004 | 17 | 84.1 (9.4) | 19 | 89.4 (10.3) | 77.1 (?) | 82.1 (?) | -7.0 (4.6) | -7.3 (5.3)
      Shapses 2 [c] | 2004 | 11 | 85.9 (9.2) | 11 | 94.2 (15.7) | 79.2 (?) | 86.6 (?) | -6.7 (2.6) | -7.6 (5.7)
      Shapses 3 [c] | 2004 | 18 | 93.7 (13.6) | 24 | 93.5 (14.3) | 87.0 (?) | 89.2 (?) | -6.7 (5.5) | -4.3 (3.5)
      Winters-Stone | 2004 | 13 | 57.2 (4.9) | 10 | 54.1 (7.2) | 56.3 (4.3) | 54.8 (7.2) | -0.9 (?) | 0.7 (?)
      Zemel | 2004 | 11 | 99.8 (14.9) | 10 | 103.1 (19.3) | 91.2 (?) | 96.5 (?) | -8.6 (5.3) [a] | -6.6 (8.2) [a]

      [a] Calculated from the standard error.
      [b] Follow-up sample size ntrt = 24 and nctrl = 24.
      [c] Shapses et al (Shapses et al, 2004) report on three randomised controlled trials. Trials 1, 2, and 3 include postmenopausal women, postmenopausal women special diet, and premenopausal women respectively.
  • 33. Meta-analysis of difference in means at baseline

      Study | Mean difference (95% CI) | % Weight
      Chee | -1.10 (-3.84, 1.64) | 22.4
      Jensen | 0.80 (-6.82, 8.42) | 2.9
      Lau | -2.00 (-4.11, 0.11) | 37.8
      Reid | -2.00 (-4.76, 0.76) | 22.1
      Shapses 1 | -5.30 (-11.74, 1.14) | 4.1
      Shapses 2 | -8.30 (-19.05, 2.45) | 1.5
      Shapses 3 | 0.20 (-8.30, 8.70) | 2.3
      Winters-Stone | 3.10 (-2.10, 8.30) | 6.2
      Zemel | -3.30 (-18.15, 11.55) | 0.8
      Overall | -1.58 (-2.88, -0.29) | 100.0

      [Forest plot: x-axis mean difference, -10 to 10; axis labels: Favours Ca supplementation, Favours control.]
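The pooled baseline difference can be recomputed from the baseline n, mean, and SD values in the table on slide 32. The sketch below assumes an inverse-variance fixed-effect model, which is consistent with the % weights shown but is not stated on the slide; the names `BASELINE` and `pool_fixed` are introduced here for illustration.

```python
from math import sqrt

# (trial, n_int, mean_int, sd_int, n_ctrl, mean_ctrl, sd_ctrl): baseline weight in kg
BASELINE = [
    ("Chee", 91, 56.1, 8.9, 82, 57.2, 9.4),
    ("Jensen", 25, 94.6, 14.0, 27, 93.8, 14.0),
    ("Lau", 95, 56.9, 7.1, 90, 58.9, 7.5),
    ("Reid", 111, 66.0, 10.0, 112, 68.0, 11.0),
    ("Shapses 1", 17, 84.1, 9.4, 19, 89.4, 10.3),
    ("Shapses 2", 11, 85.9, 9.2, 11, 94.2, 15.7),
    ("Shapses 3", 18, 93.7, 13.6, 24, 93.5, 14.3),
    ("Winters-Stone", 13, 57.2, 4.9, 10, 54.1, 7.2),
    ("Zemel", 11, 99.8, 14.9, 10, 103.1, 19.3),
]


def pool_fixed(rows):
    """Inverse-variance fixed-effect pooling of mean differences."""
    weights, effects = [], []
    for _, n1, m1, s1, n2, m2, s2 in rows:
        variance = s1 ** 2 / n1 + s2 ** 2 / n2  # variance of the mean difference
        weights.append(1 / variance)
        effects.append(m1 - m2)  # intervention minus control
    total = sum(weights)
    pooled = sum(w * d for w, d in zip(weights, effects)) / total
    se = sqrt(1 / total)
    return pooled, pooled - 1.96 * se, pooled + 1.96 * se


pooled, lower, upper = pool_fixed(BASELINE)
print(f"{pooled:.2f} ({lower:.2f}, {upper:.2f})")
```

Run on these inputs, the sketch gives a pooled difference of about -1.58 kg with a 95% CI of roughly -2.88 to -0.29, in agreement with the overall row of the forest plot.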
  • 34. Calcium supplementation on body weight (Trowman, 2006): table repeated from slide 32.
  • 35. Calcium supplementation on body weight (Trowman, 2006)

      Weights in kg; values are mean (SD). Int = intervention, Ctrl = control. Correlation = correlation between baseline and follow-up.

      Trial | Year | Int n | Baseline, Int | Ctrl n | Baseline, Ctrl | Follow-up, Int | Follow-up, Ctrl | Correlation, Int | Correlation, Ctrl
      Chee | 2003 | 91 | 56.1 (8.9) | 82 | 57.2 (9.4) | 56.1 (8.9) | 57.4 (9.4) | |
      Jensen | 2001 | 25 | 94.6 (14.0) [a] | 27 | 93.8 (14.0) [a] | 89.0 (12.7) [a] | 89.1 (14.7) [a] | |
      Lau | 2001 | 95 | 56.9 (7.1) | 90 | 58.9 (7.5) | 57.4 (7.1) | 58.6 (7.5) | |
      Reid | 2002 | 111 | 66.0 (10.0) | 112 | 68.0 (11.0) | 65.7 (10.0) | 67.9 (11.0) | |
      Shapses 1 [c] | 2004 | 17 | 84.1 (9.4) | 19 | 89.4 (10.3) | 77.1 (9.4) | 82.1 (10.3) | |
      Shapses 2 [c] | 2004 | 11 | 85.9 (9.2) | 11 | 94.2 (15.7) | 79.2 (9.2) | 86.6 (15.7) | |
      Shapses 3 [c] | 2004 | 18 | 93.7 (13.6) | 24 | 93.5 (14.3) | 87.0 (13.6) | 89.2 (14.3) | |
      Winters-Stone | 2004 | 13 | 57.2 (4.9) | 10 | 54.1 (7.2) | 56.3 (4.3) | 54.8 (7.2) | |
      Zemel | 2004 | 11 | 99.8 (14.9) | 10 | 103.1 (19.3) | 91.2 (14.9) | 96.5 (19.3) | |

      [a] Calculated from the standard error.
      [b] Follow-up sample size ntrt = 24 and nctrl = 24.
      [c] Shapses et al (Shapses et al, 2004) report on three randomised controlled trials. Trials 1, 2, and 3 include postmenopausal women, postmenopausal women special diet, and premenopausal women respectively.
  • 36. Calcium supplementation on body weight (Trowman, 2006)

      Weights in kg; values are mean (SD). Int = intervention, Ctrl = control. Correlation = correlation between baseline and follow-up.

      Trial | Year | Int n | Baseline, Int | Ctrl n | Baseline, Ctrl | Follow-up, Int | Follow-up, Ctrl | Correlation, Int | Correlation, Ctrl
      Chee | 2003 | 91 | 56.1 (8.9) | 82 | 57.2 (9.4) | 56.1 (8.9) | 57.4 (9.4) | 0.957 | 0.962
      Jensen | 2001 | 25 | 94.6 (14.0) [a] | 27 | 93.8 (14.0) [a] | 89.0 (12.7) [a] | 89.1 (14.7) [a] | |
      Lau | 2001 | 95 | 56.9 (7.1) | 90 | 58.9 (7.5) | 57.4 (7.1) | 58.6 (7.5) | 0.933 | 0.935
      Reid | 2002 | 111 | 66.0 (10.0) | 112 | 68.0 (11.0) | 65.7 (10.0) | 67.9 (11.0) | 0.984 | 0.976
      Shapses 1 [c] | 2004 | 17 | 84.1 (9.4) | 19 | 89.4 (10.3) | 77.1 (9.4) | 82.1 (10.3) | 0.880 | 0.868
      Shapses 2 [c] | 2004 | 11 | 85.9 (9.2) | 11 | 94.2 (15.7) | 79.2 (9.2) | 86.6 (15.7) | 0.960 | 0.934
      Shapses 3 [c] | 2004 | 18 | 93.7 (13.6) | 24 | 93.5 (14.3) | 87.0 (13.6) | 89.2 (14.3) | 0.918 | 0.970
      Winters-Stone | 2004 | 13 | 57.2 (4.9) | 10 | 54.1 (7.2) | 56.3 (4.3) | 54.8 (7.2) | |
      Zemel | 2004 | 11 | 99.8 (14.9) | 10 | 103.1 (19.3) | 91.2 (14.9) | 96.5 (19.3) | 0.937 | 0.910

      [a] Calculated from the standard error.
      [b] Follow-up sample size ntrt = 24 and nctrl = 24.
      [c] Shapses et al (Shapses et al, 2004) report on three randomised controlled trials. Trials 1, 2, and 3 include postmenopausal women, postmenopausal women special diet, and premenopausal women respectively.
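The slides do not state how the correlations on slide 36 were obtained, but the tabulated values are consistent with the standard identity Var(change) = Var(baseline) + Var(follow-up) - 2 r SD(baseline) SD(follow-up), solved for r. The sketch below applies that identity to the baseline, follow-up, and change SDs shown on the slides; the function name `correlation_from_sds` is introduced here for illustration.

```python
def correlation_from_sds(sd_base, sd_fv, sd_change):
    """Correlation between baseline and follow-up implied by the three SDs."""
    return (sd_base ** 2 + sd_fv ** 2 - sd_change ** 2) / (2 * sd_base * sd_fv)


# Chee, intervention arm: baseline SD 8.9, follow-up SD 8.9, change SD 2.6
r_chee_int = correlation_from_sds(8.9, 8.9, 2.6)
# Reid, intervention arm: baseline SD 10.0, follow-up SD 10.0, change SD 1.8
r_reid_int = correlation_from_sds(10.0, 10.0, 1.8)
print(round(r_chee_int, 3), round(r_reid_int, 3))
```

The identity needs all three SDs, which would explain why no correlation is shown for Jensen or Winters-Stone, where the change-score SDs are missing.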
  • 37. Combining intervention estimates from CS only

      Study | Mean difference (95% CI) | % Weight
      Chee | -0.20 (-0.98, 0.58) | 22.5
      Jensen | -0.90 (-2.64, 0.84) | 4.4
      Lau | 0.80 (0.04, 1.56) | 23.1
      Reid | -0.20 (-0.76, 0.36) | 43.7
      Shapses 1 | 0.30 (-2.93, 3.53) | 1.3
      Shapses 2 | 0.90 (-2.80, 4.60) | 1.0
      Shapses 3 | -2.40 (-5.30, 0.50) | 1.6
      Winters-Stone | -1.60 (-4.19, 0.99) | 2.0
      Zemel | -2.00 (-7.97, 3.97) | 0.4
      Overall | -0.05 (-0.42, 0.31) | 100.0

      [Forest plot: x-axis mean difference, -10 to 10; axis labels: Favours Ca supplementation, Favours control.]
  • 38. An example: Calcium supplementation on body weight
      - In this example, pooling SACS estimates is likely to produce a pooled estimate of intervention effect similar to pooling ANCOVA estimates.
          - This occurs because, even though there is baseline imbalance across the trials, the correlation between baseline weight and follow-up weight is likely to be large.
      - What did the review authors do?
          - Pooled using estimates of intervention effect calculated from FV. When there were missing SDs at follow-up, they assumed baseline SDs (imputed SDs in 7 of 9 trials).
          - Also undertook a meta-regression adjusting for the baseline difference between groups.
  • 39. Some final thoughts (1)
      - Should we pre-specify a primary analysis in the protocol of the review?
          - For trials, it is common to pre-specify the analysis in the trial protocol. This may include pre-specification of the covariates to adjust for (e.g. Senn 1989; Raab 2000).
      - OR ...
      - Should we undertake multiple analyses and draw conclusions from an interpretation based on the results from these analyses?
  • 40. Some final thoughts (2)
      - Option 2 (recreate the ANCOVA estimates) has advantages in terms of bias and precision, but it will generally involve imputing correlation coefficients (and perhaps SDs).
      - Perhaps we could pre-specify an approach such as the following in review protocols:
          - Pool using results from only one analytical method (SAFV or SACS). In terms of bias, this will generally be reasonable. In terms of precision, this is not likely to provide an optimal solution.
              - The analytical method selected could be the one for which the summary statistics are most frequently reported. In the Trowman example, this would be CS.
          - If there is large baseline imbalance across the trials, a ‘black belt’ approach of recreating the ANCOVA estimates using a range of correlations, in separate re-analyses, could be undertaken.
  • 41. References
      - Overall JE, Magee KN. Directional baseline differences and type I error probabilities in randomized clinical trials. Journal of Biopharmaceutical Statistics 1992;2(2):189-203.
      - Raab GM, Day S, Sales J. How to select covariates to include in the analysis of a clinical trial. Control Clin Trials 2000;21(4):330-42.
      - Senn SJ. Covariate imbalance and random allocation in clinical trials. Statistics in Medicine 1989;8(4):467-75.
      - Senn S. Design and interpretation of clinical trials as seen by a statistician. Chichester: Wiley & Sons, 1997.
      - Trowman R, Dumville JC, Hahn S, Torgerson DJ. A systematic review of the effects of calcium supplementation on body weight. Br J Nutr 2006;95(6):1033-8.
      - Trowman R, Dumville JC, Torgerson DJ, Cranny G. The impact of trial baseline imbalances should be considered in systematic reviews: a methodological case study. J Clin Epidemiol 2007;60(12):1229-33.
      - Wei L, Zhang J. Analysis of data with imbalance in the baseline outcome variable for randomized clinical trials. Drug Information Journal 2001;35:1201-1214.
      - Williamson PR, Gamble C, Altman DG, Hutton JL. Outcome selection bias in meta-analysis. Statistical Methods in Medical Research 2005;14:515-524.