2. Introduction
• Definitions
– Systematic review: a systematic search for all
available studies to answer a particular question
– A systematic review seeks to identify, collate, and
systematically summarize all empirical evidence on a
specific research topic, using explicit, systematic,
transparent, and replicable methods that are
designed to minimize bias.
• Provides a comprehensive summary of best evidence
on a given research topic
3. Why do we need to do a systematic
review?
• There are limitless published studies
– On which study do you want to rely?
• Information from individual studies is hard to manage
and may be:
– Biased
– Methodologically flawed
– Time and context dependent
– Misinterpreted
– Variable in conclusions
• Need for comprehensive synthesis of existing
knowledge
4. • Such synthesis leads to
– Evidence-based guidelines and policies
– Separation of sound from unsound evidence
– Efficient use of resources
– Reflection on the need for further research
5. • We benefit from SR in terms of
– Time
– Cost
– Right decisions on what works
• SR recognizes that consistent results across multiple
primary studies provide more convincing evidence than
results from any single primary study.
– Future paths of investigation
– Growth of science
6. • Goals of systematic review:
– Appraise quality of existing studies
– Provide high quality summary of the underlying
evidence
7. • Users
– Researchers
• Current state of the art
• Gap in evidence
• Formulate further research
– Practitioners
• Inform best evidence based practice
• Develop guidelines
– Policy makers
• Formulate best evidence based policy
– Patients
• Make best informed choice from alternatives
8. • Before formulating the question, establish the
need for a systematic review on the topic
9. Is there already an existing systematic review on the topic?
– No: proceed with one if sufficient evidence has emerged on the topic
– Yes: check the existing review against the questions below
• Is the existing review sufficiently comprehensive and exhaustive of the
topic in terms of databases searched, any language restrictions or other
restrictions?
– Yes: no need
– No: a new review is warranted
• Is the existing review still current or up-to-date since it was
undertaken?
– Yes: no need
– No: a new review is warranted
• Is the existing review methodologically sound and of high quality?
Have the various steps of conducting a review been adhered to,
or are some key steps missing?
– Yes: no need
– No: a new review is warranted
10. Steps in conducting a systematic
review
– Question
– Develop protocol
– Identify eligible studies
– Appraise study quality
– Extract data
– Synthesize evidence
– Interpret findings
– Write up
11. • Four broad types of research questions are particularly
appropriate for systematic reviews:
– Questions about the etiology or epidemiology of particular
conditions;
– Questions about the efficacy or effectiveness of interventions
or treatments addressing those conditions; and
– Questions about group differences, either between naturally
occurring groups (e.g., males and females) or between groups
defined by researchers (e.g., between different diagnostic
groups).
– Other types of systematic review questions might focus on
• prevalence of conditions
• changes in conditions over time
• diagnostic test accuracy
• questions related to economic evidence
12. Formulation of research questions
• The PICOS/PECO/PEO framework
– P = Participant
– I/E = Intervention or Exposure
– C = Comparator
– O = Outcome
– S = Study designs
• A good guide to the components of a well-built SR
question
• Enhances identification of relevant studies
13. • Participants
– In terms of disease/condition
– Important characteristics/demographics
• Age
• Sex
• Ethnicity
– Setting
• Facility
• Community
• Developed/Low and middle income countries
15. • Comparison
– The alternative being compared to the
intervention
– Should be specific and limited to one or two
alternatives
16. • Outcome
– Dichotomous or continuous
– Mortality, morbidity, quality of life, effectiveness,
efficiency, efficacy, etc.
• Example:
– ‘‘Is [intervention X1] more effective than
[intervention X2] in addressing [Y outcomes] in [Z
population or context]?’’
17. • Study design
– RCTs
– Cohort
– Case control
– Cross sectional
– Correlational
– Case series
– Case report
20. Features of a Good SR question
• Relevance:
– Policy
– Practice
– Care
– Research
• Feasibility
• Ethically sound
• Novelty
• Narrowness/broadness
22. Developing a Protocol
• After you have identified your area of interest
– Refine the topic
– Articulate central questions or hypotheses
– Fix the scope of the review
• Develop specific inclusion and exclusion criteria
– to clarify central concepts and set the boundaries of the review.
23. Identifying eligible studies
• Get information from the papers
• Databases
– The JBI Library
– The Cochrane Library
– Medline
– Embase
– CINAHL
• Grey literature
– Mednar
– WorldWideScience
– PsycEXTRA
– Google Scholar
24. SEARCH STRATEGY
• Searching relevant literature
– Identify key words from the identified research
topic
– Use of Boolean operators
• And
• Or
• Not
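The way keywords and Boolean operators combine can be sketched in Python. This is a minimal illustration, assuming synonyms for one concept are joined with OR, different concepts are joined with AND, and unwanted terms are removed with NOT. The helper names (`build_query`, `exclude`) and the keywords are hypothetical, and real databases each have their own query syntax.

```python
def quote(term):
    """Quote multi-word phrases so they are searched as a unit."""
    return f'"{term}"' if " " in term else term


def build_query(concept_groups):
    """Join synonyms with OR inside a concept, and concepts with AND."""
    blocks = []
    for terms in concept_groups:
        blocks.append("(" + " OR ".join(quote(t) for t in terms) + ")")
    return " AND ".join(blocks)


def exclude(query, unwanted_terms):
    """Append a NOT clause that drops records mentioning unwanted terms."""
    return f"{query} NOT (" + " OR ".join(quote(t) for t in unwanted_terms) + ")"


# Hypothetical keywords, for illustration only.
population = ["children", "adolescents"]
exposure = ["second-hand smoke", "passive smoking"]

query = exclude(build_query([population, exposure]), ["animals"])
print(query)
```

The resulting string would still need adapting to the field tags and truncation symbols of each database searched.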
25. Additional sources
• Reference list of retrieved articles
• Manual searching of relevant publications
• Experts in the field
• Correspondence with first authors of published
studies identified for the systematic review.
26. Search Logistics
Search strategy → Search sources → Export citations to bibliographic software
• Apply Search Strategy to
databases
• Export to bibliographic
software
– E.g. Endnote
• Document the process
27. Select Studies
• Selection of studies is an initial assessment
• It addresses the question “should the paper
be retrieved?”
• Aims to select only those studies that address
the review question
– Match with the inclusion criteria
• Scan titles and abstracts
• If uncertain, retrieve and scan the full text
• The selection should be:
– Transparent
– Reproducible
28. Inclusion/exclusion criteria
• Reviewers must determine which studies are
relevant and suitable for inclusion in the
systematic review.
• The criteria for study selection should be
determined ahead of time to the extent
possible,
– To reduce selection bias
• The specific inclusion criteria will depend on
the study question
– PICOS
29. • Studies will be included, if they fulfill the following
eligibility criteria:
– Study designs:
• Observational analytic
– Study settings:
• All countries
– Participants:
• Men, women & children
– Exposures:
– Outcome measures:
– Publication Status:
• All published data, abstracts and grey literature
– Timings:
• All dates
– Language of Published Articles:
• English language
30. Exclusion Criteria
• Keep log of excluded studies
• Note reasons for exclusion
• Have eligibility checked by more than one
reviewer
• Develop strategy to resolve disagreements
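The bookkeeping behind these points can be sketched in Python: one log that records why each study was excluded, and one helper that lists the studies on which two reviewers disagree so they can be resolved. This is only an illustrative sketch; the study IDs, the reasons, and the names `log_exclusion` and `find_disagreements` are hypothetical.

```python
exclusion_log = []


def log_exclusion(study_id, reason):
    """Record an excluded study together with the reason for exclusion."""
    exclusion_log.append({"study_id": study_id, "reason": reason})


def find_disagreements(decisions_a, decisions_b):
    """Return the IDs of studies where two reviewers' decisions differ."""
    shared_ids = decisions_a.keys() & decisions_b.keys()
    return sorted(sid for sid in shared_ids if decisions_a[sid] != decisions_b[sid])


# Hypothetical include (True) / exclude (False) decisions from two reviewers.
reviewer_a = {"S1": True, "S2": False, "S3": True}
reviewer_b = {"S1": True, "S2": True, "S3": True}

to_resolve = find_disagreements(reviewer_a, reviewer_b)
print(to_resolve)  # studies to discuss or refer to a third reviewer

log_exclusion("S4", "wrong study design")
print(exclusion_log)
```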
31. Appraise study quality
QUOROM for trials
Moher et al. Randomized controlled trials: The QUOROM
statement. Lancet 1999; 354:1896-1900
MOOSE for observational designs
Meta-analysis of observational studies in epidemiology. JAMA
2000;283:2008-12
Newcastle-Ottawa quality assessment scale (NOS)
A quality score was calculated based on three major
components:
1. selection of the groups of study
2. comparability
3. ascertainment of the exposure and outcome.
32. NOS score
• A score of 7 or more (with a maximum of 7 for cross-sectional
studies and 9 for case-control and cohort studies) is chosen to
indicate a high standard for comparative
observational studies.
Review with the participants (NOS tool)
• JBI quality appraisal tools for all study designs
33. • 521 references identified
• 204 duplicates removed
• 317 references: titles/abstracts scanned
• 279 do not meet inclusion criteria
• 38 studies retrieved
• 11 do not meet inclusion criteria
• 27 studies for critical appraisal
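The counts in this selection flow can be checked with simple arithmetic. The short Python sketch below uses only the numbers shown on the slide; the variable names are chosen for illustration.

```python
# Counts taken from the selection flow on the slide.
identified = 521
duplicates = 204
excluded_on_title_abstract = 279
excluded_on_full_text = 11

# Each stage is the previous stage minus the records dropped there.
screened = identified - duplicates
retrieved = screened - excluded_on_title_abstract
for_appraisal = retrieved - excluded_on_full_text

print(screened, retrieved, for_appraisal)  # 317 38 27
```

Documenting the counts at every stage in this way keeps the selection transparent and reproducible.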
34. Extract data
• Design and pilot data abstraction form
– Consider using more than one reviewer
35. Data abstraction elements
• Publication details
• Study design
• Population details (n, characteristics)
• Intervention details
• Setting
• Outcomes and findings
A sample data abstraction format is available at
http://www.cochrane.org
36. Sample data abstraction form
Author & year of publication | Country | Objective of the study | Study design | Participants | Main findings
37. • May require manipulating data into a common
format
– Calculating estimates not given in the study, e.g. CI, SE
– Contacting authors to request information not given in the
study
• Compare extraction between reviewers
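As one example of calculating an estimate that a study does not report, a standard error can be recovered from a 95% confidence interval, and vice versa. The Python sketch below assumes the interval was built as estimate ± 1.96 × SE (a normal approximation). For ratio measures such as odds ratios it assumes the interval is symmetric on the log scale. The function names are hypothetical.

```python
import math


def se_from_ci(lower, upper, z=1.96):
    """Standard error from a symmetric confidence interval."""
    return (upper - lower) / (2 * z)


def log_se_from_ratio_ci(lower, upper, z=1.96):
    """Standard error of a log ratio (e.g. log odds ratio) from its CI."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


def ci_from_se(estimate, se, z=1.96):
    """Confidence limits from an estimate and its standard error."""
    return estimate - z * se, estimate + z * se


# Hypothetical reported values, for illustration only.
print(se_from_ci(0.10, 0.90))
print(log_se_from_ratio_ci(0.50, 2.00))
print(ci_from_se(0.40, 0.10))
```

If a study used a different confidence level or a t-based interval, the multiplier `z` would need to change accordingly.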
39. Synthesize evidence
• Collation and summary of findings to derive
reliable conclusions
• Consider the strength of the overall evidence
– Consistency of results
– Possible reasons for any inconsistencies
• Can be:
– Narrative-only synthesis
• If we can’t combine the body of evidence statistically
– Quantitative summary (meta-analysis)
• If we can combine the body of evidence statistically
40. Qualitative systematic review (best evidence synthesis)
– The results of primary studies are summarized
– Not statistically combined
– Described narratively
– Still use other methods to limit bias
Quantitative systematic review (meta-analysis)
– The results of two or more primary studies are combined
– Statistically combined
• Individual patient data (IPD)/Pooled analysis
• Aggregate patient data (APD)
– Use methods to limit bias
43. How do we decide whether to do a meta-analysis (MA)?
• It depends on:
– Heterogeneity
• Clinical
• Methodological
– Homogeneity
• Clinical
• Methodological
44. Meta analysis
• Meta analysis is a statistical analysis that combines the
results of multiple scientific studies judged to be
combinable.
• Meta-analyses can be performed when there are multiple
scientific studies addressing the same question, with each
individual study reporting measurements that are expected
to have some degree of error.
• The aim then is to use approaches from statistics to derive
a pooled estimate closest to the unknown common truth
based on how this error is perceived.
• Meta-analytic results are often considered among the most trustworthy
sources of evidence for evidence-based practice.
45. Purpose of Meta analysis
• Uses statistics to summarize results of
independent studies
– Provides more precise estimates of the effects of
health care than those from the individual studies
– Facilitates investigation of the consistency and
differences across studies
46. • The goal of a synthesis is to understand the
results of any study in the context of all other
studies.
• First, we need to know whether or not the effect
size is consistent across the body of data.
• The effect size is a value which reflects the
magnitude of the treatment effect or the strength
of a relationship between two variables.
47. • It is the unit of currency in a meta-analysis.
• It is used as the dependent (outcome) variable
in a meta-analysis.
• We compute the effect size for each study,
and then work with the effect sizes to assess
the consistency of the effect across studies
and to compute a summary effect.
49. • The effect size could represent the impact of an
intervention, such as
– the impact of medical treatment on risk of infection,
– the impact of a teaching method on test scores,
• The effect size is not limited to the impact of
interventions, but could represent any
relationship between two variables, such as
– the difference in test scores for males versus females,
– the difference in cancer rates for persons exposed or
not exposed to second-hand smoke, or
– the difference in cardiac events for persons with two
distinct personality types.
• In fact, what we generally call an effect size could
refer simply to the estimate of a single value.
50. • Common Effect size metrics:
– Odds ratio
– Risk ratio (relative risk)
– Risk difference
– Correlation coefficient
– Hazard ratio
– Proportion
– Measure of central tendency (mean, median)
– Unstandardized mean difference
– Standardized mean difference
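To make two of these metrics concrete, the Python sketch below computes a log odds ratio and a log risk ratio, with their approximate variances, from a 2×2 table. It is a minimal illustration using the usual large-sample variance formulas and assumes no cell is zero. The function names and the counts are hypothetical.

```python
import math


def log_odds_ratio(a, b, c, d):
    """Log odds ratio and its variance.

    a, b: events and non-events in group 1
    c, d: events and non-events in group 2
    """
    log_or = math.log((a * d) / (b * c))
    variance = 1 / a + 1 / b + 1 / c + 1 / d
    return log_or, variance


def log_risk_ratio(a, b, c, d):
    """Log risk ratio and its variance, same cell layout as above."""
    n1, n2 = a + b, c + d
    log_rr = math.log((a / n1) / (c / n2))
    variance = 1 / a - 1 / n1 + 1 / c - 1 / n2
    return log_rr, variance


# Hypothetical counts: 10/100 events in group 1, 20/100 in group 2.
log_or, var_or = log_odds_ratio(10, 90, 20, 80)
log_rr, var_rr = log_risk_ratio(10, 90, 20, 80)
print(math.exp(log_or), var_or)
print(math.exp(log_rr), var_rr)
```

Ratio measures are usually analysed on the log scale and converted back with `math.exp` for reporting.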
51. How to choose an effect size
• There are three major considerations:
– The effect sizes from the different studies should be
comparable to one another in the sense that they
measure (at least approximately) the same thing
– Estimates of the effect size should be computable
from the information that is likely to be reported in
published research reports.
– The effect size should have good technical
properties
• Sampling distribution should be known so that variances
and confidence intervals can be computed
52. Fixed Effect and Random Effect Models
• Most meta-analyses are based on one of two
statistical models,
– Fixed-effect model or
– Random-effects model
54. Fixed Effect Model
• Under the fixed-effect model we assume that:
– All studies in the meta-analysis share a common
(true) effect size.
OR
– All factors that could influence the effect size are the
same in all the studies, and therefore the true effect
size is the same (hence the label fixed) in all the
studies.
• Since all studies share the same true effect, it
follows that the observed effect size varies from
one study to the next only because of the
random error inherent in each study.
55. PERFORMING A FIXED-EFFECT META-ANALYSIS
• In an actual meta-analysis we start with the
observed effects and try to estimate the
population effect.
• In order to obtain the most precise estimate
of the population effect (to minimize the
variance) we compute a weighted mean,
where the weight assigned to each study is
the inverse of that study’s variance.
56. • Concretely, the weight assigned to each study in a
fixed-effect meta-analysis is
$$W_i = \frac{1}{V_{Y_i}}$$
where $V_{Y_i}$ is the within-study variance for
study $i$.
• The weighted mean ($M$) is then computed as
$$M = \frac{\sum_{i=1}^{k} W_i Y_i}{\sum_{i=1}^{k} W_i}$$
that is, the sum of the products $W_i Y_i$ (weight
multiplied by effect size) divided by the sum of
the weights, taken over the $k$ studies.
57. • The variance of the summary effect is
estimated as the reciprocal of the sum of the
weights, or
$$V_M = \frac{1}{\sum_{i=1}^{k} W_i}$$
• And the estimated standard error of the
summary effect is then the square root of the
variance,
$$SE_M = \sqrt{V_M}$$
58. • Then, 95% lower and upper limits for the
summary effect are estimated as:
$$LL_M = M - 1.96 \times SE_M$$
and
$$UL_M = M + 1.96 \times SE_M$$
• Finally, a Z-value to test the null hypothesis
that the common true effect is zero can be
computed using
$$Z = \frac{M}{SE_M}$$
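The fixed-effect steps above (inverse-variance weights, weighted mean, variance, standard error, 95% limits, and Z-value) can be sketched in Python. This is a minimal illustration rather than a definitive implementation. The function name `fixed_effect` and the three effect sizes and variances are hypothetical values chosen for the example.

```python
import math


def fixed_effect(effects, variances):
    """Fixed-effect summary using inverse-variance weights."""
    weights = [1.0 / v for v in variances]          # W_i = 1 / V_Yi
    sum_w = sum(weights)
    mean = sum(w * y for w, y in zip(weights, effects)) / sum_w  # M
    var_mean = 1.0 / sum_w                          # V_M
    se_mean = math.sqrt(var_mean)                   # SE_M
    return {
        "M": mean,
        "V_M": var_mean,
        "SE_M": se_mean,
        "LL_M": mean - 1.96 * se_mean,
        "UL_M": mean + 1.96 * se_mean,
        "Z": mean / se_mean,
    }


# Hypothetical effect sizes and within-study variances.
effects = [0.2, 0.4, 0.6]
variances = [0.04, 0.02, 0.04]

fixed_result = fixed_effect(effects, variances)
for name, value in fixed_result.items():
    print(name, round(value, 4))
```

In this example the middle study has the smallest variance, so it receives twice the weight of each of the other two.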
59. Random Effects Model
• Under the random-effects model we allow
that the true effect could vary from study to
study (randomly distributed).
• For example, the effect size might be higher
(or lower) in studies where the participants
are older, or more educated, or healthier
than in others, or when a more intensive
variant of an intervention is used, and so on.
60. PERFORMING A RANDOM-EFFECTS
META-ANALYSIS
• Our goal is to use the collection of Yi to
estimate the overall mean, μ.
• In order to obtain the most precise estimate
of the overall mean (to minimize the variance)
we compute a weighted mean, where the
weight assigned to each study is the inverse of
that study’s variance.
61. • More generally, the observed effect Yi for any
study is given by the grand mean, the deviation of
the study’s true effect from the grand mean, and
the deviation of the study’s observed effect from
the study’s true effect.
• To compute a study’s variance under the random-
effects model, we need to know both the within-
study variance and T², since the study’s total
variance is the sum of these two values.
62. • The parameter T² (tau-squared) is the
between-studies variance (the variance of the
effect size parameters across the population
of studies).
• In other words, if we somehow knew the true
effect size for each study, and computed the
variance of these effect sizes (across an
infinite number of studies), this variance
would be T².
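63. • T² is estimated by using the method of moments (or the
DerSimonian and Laird method), as follows:
$$T^2 = \frac{Q - df}{C}$$
where
$$Q = \sum_{i=1}^{k} W_i Y_i^2 - \frac{\left(\sum_{i=1}^{k} W_i Y_i\right)^2}{\sum_{i=1}^{k} W_i}, \quad df = k - 1, \quad C = \sum_{i=1}^{k} W_i - \frac{\sum_{i=1}^{k} W_i^2}{\sum_{i=1}^{k} W_i}$$
Here $k$ is the number of studies, $W_i = 1/V_{Y_i}$ are the fixed-effect
weights, and $Q$ is the Q statistic (a measure of weighted squared
deviations). If the estimate is negative, T² is set to zero.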
64. Estimating the mean effect size
• Under the random-effects model the weight
assigned to each study is
$$W_i^* = \frac{1}{V_{Y_i}^*}$$
• where $V_{Y_i}^*$ is the within-study variance for
study $i$ plus the between-studies variance, T².
That is,
$$V_{Y_i}^* = V_{Y_i} + T^2$$
65. • The weighted mean, $M^*$, is then computed as
$$M^* = \frac{\sum_{i=1}^{k} W_i^* Y_i}{\sum_{i=1}^{k} W_i^*}$$
• That is, the sum of the products (weight
multiplied by effect size) divided by the sum of
the weights.
66. • The variance of the summary effect is
estimated as the reciprocal of the sum of the
weights, or
$$V_{M^*} = \frac{1}{\sum_{i=1}^{k} W_i^*}$$
• and the estimated standard error of the
summary effect is then the square root of the
variance,
$$SE_{M^*} = \sqrt{V_{M^*}}$$
67. • The 95% lower and upper limits for the
summary effect would be computed as
$$LL_{M^*} = M^* - 1.96 \times SE_{M^*}$$
and
$$UL_{M^*} = M^* + 1.96 \times SE_{M^*}$$
• Finally, a Z-value to test the null hypothesis
that the mean effect μ is zero could be
computed using
$$Z^* = \frac{M^*}{SE_{M^*}}$$
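The random-effects steps can be sketched in the same way: estimate T² by the method of moments (DerSimonian and Laird), add it to each within-study variance, and recompute the weighted mean. This is a minimal illustration under those assumptions. The function name `random_effects_dl` and the data are hypothetical, and a negative T² estimate is truncated to zero, as is conventional for this estimator.

```python
import math


def random_effects_dl(effects, variances):
    """Random-effects summary with the DerSimonian and Laird T² estimate."""
    k = len(effects)
    w = [1.0 / v for v in variances]                # fixed-effect weights
    sum_w = sum(w)
    sum_wy = sum(wi * yi for wi, yi in zip(w, effects))
    sum_wy2 = sum(wi * yi * yi for wi, yi in zip(w, effects))

    q = sum_wy2 - sum_wy ** 2 / sum_w               # Q statistic
    df = k - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)                   # T², not below zero

    w_star = [1.0 / (v + tau2) for v in variances]  # W_i* = 1 / (V_Yi + T²)
    sum_w_star = sum(w_star)
    mean = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum_w_star
    se_mean = math.sqrt(1.0 / sum_w_star)
    return {
        "Q": q,
        "T2": tau2,
        "M*": mean,
        "SE_M*": se_mean,
        "LL_M*": mean - 1.96 * se_mean,
        "UL_M*": mean + 1.96 * se_mean,
        "Z*": mean / se_mean,
    }


# Hypothetical effect sizes and within-study variances.
effects = [0.0, 0.4, 0.8]
variances = [0.04, 0.02, 0.04]

random_result = random_effects_dl(effects, variances)
for name, value in random_result.items():
    print(name, round(value, 4))
```

Because T² is added to every study's variance, the weights become more similar across studies and the standard error of the summary effect is larger than under the fixed-effect model.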
68. WHICH MODEL SHOULD WE USE?
• Fixed effect
• We use the fixed-effect model if two
conditions are met
– First, we believe that all the studies included in
the analysis are functionally identical.
– Second, our goal is to compute the common effect
size for the identified population, and not to
generalize to other populations.
69. • Random effects
– We use the random-effects model if the goal of the analysis is to
generalize to a range of scenarios.
– If the data are gathered from the published
literature, the random-effects model is generally a
more plausible match
70. Tests for heterogeneity
• The usual ways of assessing whether there is true
heterogeneity in a meta-analysis are the Q-test and I-square (I²)
• The classical measure of heterogeneity is Cochran’s Q,
which is calculated as the weighted sum of squared
differences between individual study effects and the pooled
effect across studies, with the weights being those used in
the pooling method.
71. • I-square (I²): when it is very low, a fixed-effect model may be used
– The I² statistic describes the percentage of
variation across studies that is due to
heterogeneity rather than chance.
– I² = 100% x (Q-df)/Q
– I² is an intuitive and simple expression of the
inconsistency of studies’ results.
• The purpose of this test is to assess the
extent of variation between the sample
estimates.
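Cochran's Q and I² can be computed together, as in the Python sketch below. It assumes inverse-variance (fixed-effect) weights for the pooled effect and truncates I² at zero when Q is smaller than its degrees of freedom. The function name `heterogeneity` and the data are hypothetical.

```python
def heterogeneity(effects, variances):
    """Cochran's Q, its degrees of freedom, and I² (in percent)."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)

    # Q: weighted sum of squared differences from the pooled effect.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1

    # I² = 100% x (Q - df) / Q, reported as 0 when Q <= df.
    i2 = max(0.0, 100.0 * (q - df) / q) if q > 0 else 0.0
    return q, df, i2


# Hypothetical effect sizes and within-study variances.
q, df, i2 = heterogeneity([0.0, 0.4, 0.8], [0.04, 0.02, 0.04])
print(round(q, 4), df, round(i2, 2))
```

Under the null hypothesis of no heterogeneity, Q is compared against a chi-squared distribution with df degrees of freedom.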
72. Tests for publication bias
• Egger’s test is commonly used to assess potential
publication bias in a meta-analysis via funnel plot
asymmetry
• The test is a linear regression of the intervention
effect estimates on their standard errors,
weighted by their inverse variance.
• A funnel plot is a scatter plot of treatment effect
against a measure of study precision.
• It is used primarily as a visual aid for detecting
bias or systematic heterogeneity.
• A symmetric inverted funnel shape arises from a
‘well-behaved’ data set, in which publication bias
is unlikely.
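Egger's regression can be sketched in plain Python. The version below uses an equivalent formulation of the weighted regression described above: the standardized effect (effect divided by its standard error) is regressed on precision (1 divided by the standard error) by ordinary least squares, and the intercept measures funnel plot asymmetry. This is an illustrative sketch only. The function name `egger_regression` and the data are hypothetical, and the p-value, which needs a t distribution with k − 2 degrees of freedom, is left out.

```python
import math


def egger_regression(effects, std_errors):
    """Intercept, its standard error, t statistic, and slope of Egger's regression."""
    k = len(effects)
    x = [1.0 / s for s in std_errors]                    # precision
    y = [e / s for e, s in zip(effects, std_errors)]     # standardized effect

    mean_x = sum(x) / k
    mean_y = sum(y) / k
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual variance with k - 2 degrees of freedom.
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    sigma2 = rss / (k - 2)
    se_intercept = math.sqrt(sigma2 * (1.0 / k + mean_x ** 2 / sxx))

    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return intercept, se_intercept, t_stat, slope


# Hypothetical data in which smaller (less precise) studies show larger effects.
effects = [0.10, 0.20, 0.40, 0.70]
std_errors = [0.05, 0.10, 0.20, 0.40]

intercept, se_intercept, t_stat, slope = egger_regression(effects, std_errors)
print(intercept, se_intercept, t_stat, slope)
```

An intercept far from zero relative to its standard error suggests asymmetry; with few studies the test has low power, so the funnel plot should still be inspected visually.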