Cluster Randomization Trials Explained: Design, Analysis, Sample Size

Cluster Randomization Trials
Dr. Ranadip Chowdhury.
M.B.B.S., M.D.
M.I.P.H.A.

What Are Cluster Randomization Trials
Cluster randomization trials are experiments in
which intact social units or clusters of
individuals rather than independent individuals
are randomly allocated to intervention groups.

Examples:
• Medical practices selected as the
randomization unit.
• Communities selected as the randomization
unit.
• Hospitals selected as the randomization unit
in trials.

Reasons for Adopting Cluster Randomization
• Intervention naturally applied at the
cluster level
• Administrative convenience
• To avoid treatment group contamination
• To obtain cooperation of investigators
• To enhance subject compliance

Challenges of CRTs
• Unit of Randomization vs. Unit of Analysis.
• Low power and a relatively high probability of
chance imbalance between intervention arms.
• Post-randomization recruitment bias.

Design
• Two main approaches to randomization:
– Unrestricted allocation (simple randomization)
– Restricted allocation:
    • Matching
    • Stratification
    • Minimization
    • Covariate-constrained randomization

Choosing an allocation technique

Advantages vs. Limitations of Allocation Techniques
Simple randomization
– Advantages: no need for baseline data.
– Limitations: higher risk of imbalance.
Matching
– Advantages: improves face validity; balances effectively for covariates.
– Limitations: loss to follow-up is doubled; challenges with analysis; difficult to estimate the ICC; reduced degrees of freedom limit power.
Stratification
– Advantages: may be used in combination with other allocation techniques.
– Limitations: can balance for only a few covariates on its own.
Minimization
– Advantages: can balance effectively for many covariates.
– Limitations: continuous covariates may need to be split into categories; potential for selection bias.
Covariate-constrained randomization
– Advantages: balances most effectively for many covariates; limits risk of selection bias.
– Limitations: requires access to baseline data; requires additional statistical support; allocation must occur after recruitment.
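As an illustration of covariate-constrained randomization, the following is a minimal sketch, not a validated allocation procedure. It assumes a single baseline covariate per cluster, two equal-sized arms, and an imbalance measure (absolute difference in arm means) chosen here for simplicity. The function name, the `keep_fraction` parameter, and the cluster labels are hypothetical.

```python
import itertools
import random
from statistics import mean


def constrained_allocation(covariate, keep_fraction=0.25, seed=None):
    """Pick one allocation at random from the best-balanced candidate splits.

    covariate: dict mapping a (hypothetical) cluster id to one baseline value.
    keep_fraction: share of candidate allocations treated as acceptable
                   (an assumed tuning choice, not a recommended value).
    """
    clusters = sorted(covariate)
    half = len(clusters) // 2
    scored = []
    # Enumerate every way of assigning half the clusters to the intervention arm.
    for intervention in itertools.combinations(clusters, half):
        control = tuple(c for c in clusters if c not in intervention)
        imbalance = abs(mean(covariate[c] for c in intervention)
                        - mean(covariate[c] for c in control))
        scored.append((imbalance, intervention, control))
    # Keep only the allocations with the smallest covariate imbalance.
    scored.sort(key=lambda item: item[0])
    n_keep = max(1, int(len(scored) * keep_fraction))
    # The final allocation is still chosen at random, within the constrained set.
    return random.Random(seed).choice(scored[:n_keep])


# Made-up baseline values for six clusters.
baseline = {"A": 10, "B": 12, "C": 30, "D": 32, "E": 50, "F": 52}
imbalance, intervention, control = constrained_allocation(baseline, seed=1)
print(intervention, control, round(imbalance, 2))
```

Enumerating all splits is only practical for a small number of clusters. With many clusters or several covariates, a random sample of candidate allocations would usually be scored instead.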

Cohort versus cross-sectional designs
Limitations of cohort designs:
• Possible instability in large cohorts, with the
resulting likelihood of subject loss to follow-up.
• Reduced representativeness of the target population,
which is invariably hampered by the ageing of the cohort
over time.
If the primary questions of interest focus on
change at the community level rather than at the
level of the individual, cohort samples are the less
natural choice.

Methodological Considerations in CRT
• Observations on participants in the same cluster
tend to be correlated (non-independent).
• The degree of correlation within clusters is known as
the intracluster correlation coefficient (ICC, ρ).
• The intracluster correlation coefficient is the
proportion of the total variance of the outcome
that can be explained by the variation between
clusters.

Sample size
• Two important components of variation:
• Within cluster (intracluster correlation coefficient)
• Between cluster
(A useful rule of thumb is that power does not increase
appreciably once the number of subjects per cluster exceeds 1/ρ.)
• No simple relation exists between k and ρ for
continuous outcomes, but a relation exists for binary
outcomes.
• For the same statistical power, the overall sample size
needs to be larger in a CRT than in an individually
randomized trial.
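The diminishing return from enlarging clusters can be explored numerically. The sketch below assumes a continuous outcome, equal cluster sizes, a two-sided test, and a normal approximation to power. The function names and the example values of k, ρ and d are illustrative assumptions, not taken from the slides.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution


def crt_power(k, m, rho, d, sd=1.0, alpha=0.05):
    """Approximate power with k clusters per arm, each of size m."""
    vif = 1 + (m - 1) * rho                  # variance inflation factor
    se = sd * (2 * vif / (k * m)) ** 0.5     # SE of the difference in means
    return Z.cdf(d / se - Z.inv_cdf(1 - alpha / 2))


def power_ceiling(k, rho, d, sd=1.0, alpha=0.05):
    """Limit of crt_power as m grows: the effective sample size per arm
    can never exceed k / rho, however many subjects each cluster has."""
    se = sd * (2 * rho / k) ** 0.5
    return Z.cdf(d / se - Z.inv_cdf(1 - alpha / 2))


k, rho, d = 20, 0.05, 0.25                   # here 1 / rho = 20
for m in (5, 10, 20, 50, 200, 1000):
    print(f"m = {m:5d}  power = {crt_power(k, m, rho, d):.3f}")
print(f"ceiling      power = {power_ceiling(k, rho, d):.3f}")
```

Under these assumptions, adding clusters raises the ceiling itself, whereas adding subjects to existing clusters only moves power towards it.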

Standard sample size formulae for CRT
• The required sample size per arm is
\[ n_c = n_I \times \mathrm{VIF}, \qquad \mathrm{VIF} = 1 + (m - 1)\rho \]
where nI is the required sample size
per arm using a trial with individual
randomization to detect a difference
d, and the VIF (design effect) can be
modified to allow for variation in
cluster sizes. This is the standard
result: the required sample size
for a CRT is that required under
individual randomization, inflated
by the variance inflation factor.
• The trial will randomize the
intervention over k clusters per arm,
each of size m, to provide a total of
nc = mk individuals per arm.

• The number of clusters
required per arm, assuming equal
cluster sizes:
\[ k = \frac{n_I\,[1 + (m - 1)\rho]}{m} \]
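The standard result (sample size under individual randomization, inflated by VIF = 1 + (m − 1)ρ, then divided by the cluster size m) can be sketched as follows. The slides do not give the formula for nI, so the usual normal-approximation formula for comparing two means, nI = 2σ²(z₁₋α/₂ + z₁₋β)²/d², is used here as an assumption. All function names and example values are illustrative.

```python
import math
from statistics import NormalDist


def n_individual(d, sd, alpha=0.05, power=0.80):
    """Per-arm sample size under individual randomization (assumed formula:
    two-sided comparison of two means, normal approximation)."""
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return 2 * sd ** 2 * z_sum ** 2 / d ** 2


def design_effect(m, rho):
    """Variance inflation factor for equal clusters of size m."""
    return 1 + (m - 1) * rho


def clusters_per_arm(d, sd, m, rho, alpha=0.05, power=0.80):
    """Number of clusters per arm, k = nI * VIF / m, rounded up."""
    n_c = n_individual(d, sd, alpha, power) * design_effect(m, rho)
    return math.ceil(n_c / m)


# Illustrative inputs: detect d = 0.25 SD with clusters of 20 and rho = 0.05.
n_i = n_individual(d=0.25, sd=1.0)
print(math.ceil(n_i), design_effect(20, 0.05), clusters_per_arm(0.25, 1.0, 20, 0.05))
```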

CRTs with a fixed number of clusters: sample size per cluster
• For a trial with a fixed
number of equal-sized
clusters (k), the required
sample size per arm for
a trial with pre-specified
power 1 − β, to detect a
difference of d, is nc:
\[ n_c = \frac{n_I\,k\,(1 - \rho)}{k - n_I\rho} \]
• where nI is the sample
size required under
individual
randomization.

• The corresponding number
of individuals in each of the
k equally sized clusters:
\[ m = \frac{n_I\,(1 - \rho)}{k - n_I\rho} \]

CRTs with a fixed number of clusters: practical advice
• Determine the required number of individuals per arm
in a trial using individual randomization (nI).
• Determine whether a sufficient number of clusters is
available. For equal-sized clusters, this is the case
when k > nIρ.
• Where the design is still not feasible, either:
– the power must be reset at a value lower than the maximum
available power, or
– the detectable difference must be set greater than the minimum
detectable difference, or
– both power and detectable difference are adjusted in combination.
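The feasibility check and the resulting cluster size can be sketched in a few lines. This assumes equal-sized clusters and solves mk = nI[1 + (m − 1)ρ] for m, which gives m = nI(1 − ρ)/(k − nIρ). The function name and the example inputs are hypothetical.

```python
import math


def cluster_size_for_fixed_k(n_i, k, rho):
    """Subjects needed per cluster when only k clusters per arm are available.

    n_i: per-arm sample size under individual randomization.
    Returns None when the design is infeasible, i.e. when k <= n_i * rho,
    because no cluster size can then reach the target power.
    """
    if k <= n_i * rho:
        return None
    return math.ceil(n_i * (1 - rho) / (k - n_i * rho))


# Illustrative inputs: n_i = 252 per arm, rho = 0.05, so n_i * rho = 12.6.
for k in (12, 15, 25, 50):
    print(k, cluster_size_for_fixed_k(252, k, 0.05))
```

When the function returns None, the options are those listed above: accept lower power, a larger detectable difference, or both.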

Statistical model for intracluster correlation
\[ y_{ik} = \mu + b_k + \varepsilon_{ik} \]
• where yik is the value of the
response variable for unit i
in cluster k, and μ is the
overall mean. The
remaining two terms
represent the two levels of
variation in the data, with εik
representing the "within-cluster"
variation between
observations from the same
cluster, and bk the "between-cluster"
variation.
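To make the two levels of variation concrete, the sketch below simulates data with a cluster-level term and an individual-level term, then estimates ρ as the between-cluster share of the total variance. It assumes normally distributed terms and equal cluster sizes, and it uses the one-way ANOVA estimator of the ICC, which is one common choice rather than the method prescribed by the slides. All names and parameter values are illustrative.

```python
import random
from statistics import mean


def simulate_clusters(n_clusters, m, mu, sd_between, sd_within, seed=0):
    """Draw y = mu + b_k + e_ik with normal cluster and individual terms."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_clusters):
        b_k = rng.gauss(0.0, sd_between)           # shared by the whole cluster
        data.append([mu + b_k + rng.gauss(0.0, sd_within) for _ in range(m)])
    return data


def anova_icc(data):
    """One-way ANOVA estimate of the ICC for equal-sized clusters."""
    n_clusters, m = len(data), len(data[0])
    grand = mean(y for cluster in data for y in cluster)
    cluster_means = [mean(cluster) for cluster in data]
    # Mean squares between and within clusters.
    msb = m * sum((cm - grand) ** 2 for cm in cluster_means) / (n_clusters - 1)
    msw = sum((y - cm) ** 2
              for cluster, cm in zip(data, cluster_means)
              for y in cluster) / (n_clusters * (m - 1))
    return (msb - msw) / (msb + (m - 1) * msw)


# True ICC = 0.05 / (0.05 + 0.95) = 0.05 in this simulated example.
data = simulate_clusters(200, 20, mu=10.0, sd_between=0.05 ** 0.5, sd_within=0.95 ** 0.5)
print(round(anova_icc(data), 3))
```

With few clusters the estimate is very imprecise, which is why ICC values taken from small studies should be treated with caution.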

Analysis
• Reducing clusters to independent observations
or summary statistics.
• Fixed effect regression/ ANOVA
• Methods that explicitly account for clustering

SUMMARY STATISTICS:
– An unweighted analysis gives every cluster equal
weight, even when the numbers of observations per
cluster are unequal.
– By taking the average of the observations in each
cluster, information about the individual
observations is lost.
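A cluster-level summary analysis can be sketched as follows: each cluster is reduced to its mean, and the two arms are compared with an ordinary two-sample t statistic on those means. This is the unweighted version, so a cluster with one observation counts as much as a cluster with many. The function name and the data are made up for illustration. The p-value is left to a t table or a statistics package, since the Python standard library has no t distribution.

```python
from statistics import mean, variance


def cluster_level_t(intervention, control):
    """Unweighted two-sample t statistic computed on cluster means.

    Each argument is a list of clusters; each cluster is a list of outcomes.
    Returns (t statistic, degrees of freedom).
    """
    means_i = [mean(cluster) for cluster in intervention]
    means_c = [mean(cluster) for cluster in control]
    k_i, k_c = len(means_i), len(means_c)
    df = k_i + k_c - 2                    # based on clusters, not individuals
    pooled = ((k_i - 1) * variance(means_i)
              + (k_c - 1) * variance(means_c)) / df
    se = (pooled * (1 / k_i + 1 / k_c)) ** 0.5
    return (mean(means_i) - mean(means_c)) / se, df


# Made-up data: three clusters per arm with unequal cluster sizes.
intervention = [[3, 3], [4], [5, 5, 5]]
control = [[1], [2, 2], [3]]
t_stat, df = cluster_level_t(intervention, control)
print(round(t_stat, 3), df)
```

The degrees of freedom depend on the number of clusters rather than the number of individuals, which is one source of the low power noted earlier.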

Fixed effects regression/ANOVA approaches
– If a fixed effect is used, then the results of the
analysis are strictly only applicable to the
particular set of clusters in the study.
– If the data are normal or can be transformed to
normality, then a normal regression (ANOVA)
approach with a fixed effect for cluster and an
effect for group can be used.

• Methods that explicitly account for clustering:
– Methods that adjust existing tests to account
for clustering
• Depend on the data distribution
– Modeling approaches
• Linear mixed model (LMM)
• Generalized linear mixed model (GLMM)
• Generalized estimating equations (GEE)

Cluster Specific (CS) Model
• The clusters are sampled from a
larger population and the effect of
any particular cluster i is to add a
random effect Zi to all the outcomes.
For a cluster randomized trial we could set
X = 1 for intervention and X = 0 for
control. A CS model measures the
effect on Y of changing X, while Z is
held constant. This is a common
model for longitudinal data, where it
is possible to imagine, say in a
cross-over trial, a treatment value changing
over time. However, in a cluster
randomized trial, everyone in a cluster
receives the same treatment, and
although a CS model can be fitted,
the result can be interpreted only
theoretically.

Marginal Model
• Fitting this model is
equivalent to fitting a
marginal (population-averaged)
model; that is, we estimate
the effect of X on Y as averaged
over all the clusters Z.

• CS models would seem to be most suitable for testing
the effect of individual-level covariates, while marginal
models are conceptually preferable for estimating the
effect of cluster-level covariates.
• The difference between the two approaches disappears as
the ICC approaches zero.
• CS models provide direct estimates of variance components,
while those are treated as nuisance parameters when
the population-averaged approach is adopted.

Pitfalls and Controversies
• Ethical Issues
• Unit of reference
• Overmatching
• Sample size and study power
• Assessing the value of the ICC from small studies.
