We discuss a general roadmap for generating causal inference based on observational studies used to generate real-world evidence. We review targeted minimum loss estimation (TMLE), which provides a general template for the construction of asymptotically efficient plug-in estimators of a target estimand for realistic (i.e., infinite-dimensional) statistical models. TMLE is a two-stage procedure that first uses ensemble machine learning, termed super-learning, to estimate the relevant stochastic relations between the treatment, censoring, covariates, and outcome of interest. The super-learner allows one to fully utilize all the advances in machine learning (in addition to more conventional parametric-model-based estimators) to build a single most powerful ensemble machine learning algorithm. We present the Highly Adaptive Lasso as an important machine learning algorithm to include.
In the second step, TMLE maximizes a parametric likelihood along a so-called least favorable parametric submodel through the super-learner fit of the relevant stochastic relations in the observed data. This second step bridges the state of the art in machine learning to estimators of target estimands for which statistical inference is available (i.e., confidence intervals, p-values, etc.). We also review recent advances in collaborative TMLE, in which the fit of the treatment and censoring mechanism is tailored with respect to the performance of the TMLE, and we discuss asymptotically valid bootstrap-based inference. Simulations and data analyses are provided as demonstrations.
Causal Inference Opening Workshop - Targeted Learning for Causal Inference Based on Real World Data - Mark van der Laan, December 9, 2019
1. Targeted Learning of Causal Impacts Based on Real World Data
Mark van der Laan
Division of Biostatistics, UC Berkeley
December 9, 2019
SAMSI Workshop on Causal Inference, Durham
Joint work with Wilson Cai, David Benkeser, Maya Petersen
This research was supported by NIH grant R01 AI074345-09.
2. Outline
1 Ingredients of Targeted Learning: Causal Framework, Super-learning, Targeting
2 Highly Adaptive Lasso (HAL) estimator: candidate for Super-Learner library
3 Targeted Minimum Loss Based Estimation (TMLE)
4 Universal least favorable submodels for one-step TMLE
5 Example: One-step TMLE of causal impact of point intervention on survival
6 Group sequential adaptive RCT to learn optimal rule
7 TMLE of Causal Effects of Multiple Time Point Interventions on Survival
8 Sequential Regression Representation of Treatment Specific Mean
9 TMLE in complex observational study of diabetes (Neugebauer et al.)
10 Software for Targeted Learning
5. Highly Adaptive Lasso MLE (HAL-MLE)
• This is a maximum likelihood / minimum loss estimator that estimates functionals (e.g., the outcome regression and propensity score) by approximating them with a linear model in many (≤ n·2^d) tensor-product indicator basis functions, constraining the L1-norm of the coefficient vector, and choosing that bound with cross-validation (vdL, 2015; Benkeser, vdL, 2017).
• Guaranteed to converge to the truth at rate n^{-1/3}(log n)^d in sample size n (Bibaut, vdL, 2019): the only assumption is that the true function is càdlàg (right-continuous with left-hand limits) and has finite sectional variation norm.
• When HAL is used in the super-learner library (or by itself), the TMLE (targeted learning) is guaranteed consistent, (double robust) asymptotically normal, and efficient: one only needs the strong positivity assumption.
• The undersmoothed HAL-MLE is itself efficient for smooth functionals.
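The construction in the first bullet can be sketched directly. The deck's software is R (glmnet()); the following is a hypothetical Python sketch for d = 2 using sklearn's cross-validated lasso in its place, with the zero-order tensor-product indicator basis knotted at the observations. All simulation details are invented for illustration.

```python
import numpy as np
from itertools import combinations
from sklearn.linear_model import LassoCV

def hal_basis(X, knots):
    """Zero-order tensor-product indicator basis: for every non-empty subset s
    of coordinates and every knot row, phi(x) = prod_{j in s} 1{x_j >= knot_j}."""
    n, d = X.shape
    cols = []
    for r in range(1, d + 1):
        for s in combinations(range(d), r):
            for k in knots:
                cols.append(np.all(X[:, s] >= k[list(s)], axis=1).astype(float))
    return np.column_stack(cols)

rng = np.random.default_rng(0)
n, d = 200, 2
X = rng.uniform(-1, 1, size=(n, d))
y = np.sin(3 * X[:, 0]) + (X[:, 1] > 0) + 0.1 * rng.standard_normal(n)

H = hal_basis(X, X)            # knots at the observations: n * (2^d - 1) columns
fit = LassoCV(cv=5).fit(H, y)  # cross-validated L1 bound, as in HAL
mse = float(np.mean((fit.predict(H) - y) ** 2))
print(H.shape, round(mse, 3))
```

The L1-norm of the fitted coefficient vector plays the role of the sectional variation norm bound; cross-validation selects it, exactly as the bullet describes.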
6. Example: HAL-MLE of conditional hazard
• Suppose that O = (W, A, T̃ = min(T, C), Δ = I(T ≤ C)), and that we are interested in estimating the conditional hazard λ(t | A, W).
• Let L(λ) be the log-likelihood loss.
• If T is continuous, we could parametrize λ(t | A, W) = exp(ψ(t, A, W)), or, if T is discrete, logit λ(t | A, W) = ψ(t, A, W).
• We can represent ψ = Σ_{s ⊂ {1,...,d}} Σ_j β_{s,j} φ_{u_{s,j}} as a linear combination of indicator basis functions, where the L1-norm of β represents the sectional variation norm of ψ.
• Therefore, we can compute the HAL-MLE of λ with either Cox-Lasso or logistic Lasso regression (glmnet()).
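For discrete T, the reduction to logistic lasso works by expanding the data into person-time format: one Bernoulli row per subject per at-risk time point. A hypothetical Python sketch (sklearn's L1-penalized logistic regression standing in for glmnet(), and raw (t, A, W) columns rather than the full HAL indicator basis, for brevity):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def person_time(W, A, T_tilde, Delta, t_max):
    """Expand (W, A, T~, Delta) into person-time rows: one row per subject per
    time point still at risk, with outcome 1{T~ = t, Delta = 1}."""
    rows, events = [], []
    for w, a, t, dlt in zip(W, A, T_tilde, Delta):
        for k in range(1, min(t, t_max) + 1):
            rows.append([k, a, w])
            events.append(1.0 if (k == t and dlt == 1) else 0.0)
    return np.array(rows), np.array(events)

rng = np.random.default_rng(1)
n = 500
W = rng.uniform(size=n)
A = rng.integers(0, 2, size=n)
T = rng.geometric(0.2 + 0.1 * A)            # discrete event time (invented model)
C = rng.geometric(0.15, size=n)             # discrete censoring time
T_tilde = np.minimum(T, C)
Delta = (T <= C).astype(int)

Xlong, ylong = person_time(W, A, T_tilde, Delta, t_max=10)
# L1-penalized pooled logistic regression: logit lambda(t | A, W) = psi(t, A, W)
haz_fit = LogisticRegression(penalty="l1", solver="liblinear", C=1.0).fit(Xlong, ylong)
lam = haz_fit.predict_proba(Xlong)[:, 1]    # fitted discrete-time hazards
print(Xlong.shape, round(float(lam.mean()), 3))
```

A real HAL-MLE would first map (t, A, W) into the tensor-product indicator basis of the previous slide before running the penalized regression.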
10. TMLE
Let D*(P) be the canonical gradient / efficient influence curve of the target parameter Ψ : M → IR at P ∈ M.
Initial estimator: P^0_n is an initial estimator of P_0. We recommend super-learning.
Targeting of initial estimator: Construct a so-called least favorable parametric submodel {P^0_{n,ε} : ε} ⊂ M through P^0_n so that

(d/dε) L(P^0_{n,ε}) |_{ε=0} spans the canonical gradient D*(P^0_n) at P^0_n,

where (e.g.) L(P)(O) = − log p(O) is the log-likelihood loss. Let

ε_n = arg min_ε Σ_i L(P^0_{n,ε})(O_i)

be the MLE, and set P*_n = P^0_{n,ε_n}.
TMLE of ψ_0: The TMLE of ψ_0 is the plug-in estimator Ψ(P*_n).
Solves optimal estimating equation: P_n D*(P*_n) ≡ (1/n) Σ_i D*(P*_n)(O_i) ≈ 0.
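To make the template concrete, here is a hypothetical toy Python sketch of the two steps for the treatment-specific mean Ψ(P) = E_W E(Y | A = 1, W) with a binary outcome: a deliberately crude initial fit, followed by the targeting step, a logistic fluctuation with clever covariate H = I(A = 1)/g(1 | W), whose ε is solved by Newton's method so that the score P_n[H (Y − Q̄_ε)] = 0. The simulation and all names are invented for illustration.

```python
import numpy as np

def expit(x): return 1.0 / (1.0 + np.exp(-x))
def logit(p): return np.log(p / (1.0 - p))

rng = np.random.default_rng(2)
n = 2000
W = rng.uniform(size=n)
g = expit(1.0 - 2.0 * W)                 # true propensity P(A=1|W), assumed known here
A = rng.binomial(1, g)
Y = rng.binomial(1, expit(-1.0 + W + A)) # true outcome regression E(Y|A,W)

Q1_init = np.full(n, Y.mean())           # crude (misspecified) initial fit of E(Y|A=1,W)
H = A / g                                # clever covariate for Psi(P) = E_W E(Y|A=1,W)

# Targeting: logistic fluctuation logit Q_eps = logit Q_init + eps * H,
# with eps solved by Newton so that the score Pn[H (Y - Q_eps)] = 0.
eps = 0.0
for _ in range(100):
    Qe = expit(logit(Q1_init) + eps * H)
    score = np.mean(H * (Y - Qe))
    hess = -np.mean(H**2 * Qe * (1 - Qe))
    eps -= score / hess

Q1_star = expit(logit(Q1_init) + eps * (1.0 / g))  # targeted fit evaluated at A = 1
psi_tmle = float(Q1_star.mean())                   # plug-in TMLE
print(round(psi_tmle, 3))
```

Because g is correctly specified here, the targeted plug-in is consistent despite the misspecified initial Q̄ (double robustness); the truth in this simulation is ∫₀¹ expit(w) dw ≈ 0.62.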
11. Local least favorable submodel
Let O ∼ P_0 ∈ M. Let Ψ : M → IR be a one-dimensional target parameter, and let D*(P) be its canonical gradient at P. A 1-d local least favorable submodel (LFM) {p^{lfm}_ε : ε} satisfies

(d/dε) log p^{lfm}_ε |_{ε=0} = D*(P).

Equivalently, the score of an LFM maximizes the Cramer-Rao lower bound over all 1-d parametric submodels {P_ε : ε} through P:

CR(h | P) = lim_{ε→0} (Ψ(P_{ε,h}) − Ψ(P))^2 / ( −2 P log(dP_{ε,h}/dP) ).

That is, an LFM has a local behavior that maximizes the squared change in the target parameter per unit increase in information/likelihood.
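A standard concrete instance (the canonical example, not taken from this slide): for the treatment-specific mean Ψ(P) = E_W Q̄(1, W), with Q̄(A, W) = E_P(Y | A, W) and g(1 | W) = P(A = 1 | W), the canonical gradient and the logistic local LFM for the Q̄-factor are

```latex
% Canonical gradient of the treatment-specific mean \Psi(P) = E_W \bar Q(1,W):
\[
D^{*}(P)(O)
  = \frac{I(A=1)}{g(1\mid W)}\bigl(Y-\bar Q(A,W)\bigr)
  + \bar Q(1,W) - \Psi(P).
\]
% Logistic fluctuation of \bar Q with clever covariate H:
\[
\operatorname{logit}\bar Q_{\varepsilon} = \operatorname{logit}\bar Q + \varepsilon H,
\qquad H(A,W)=\frac{I(A=1)}{g(1\mid W)},
\]
% whose score at \varepsilon = 0 under the log-likelihood loss is
\[
\left.\frac{d}{d\varepsilon}\bigl(-L(\bar Q_{\varepsilon})\bigr)\right|_{\varepsilon=0}
  = H(A,W)\bigl(Y-\bar Q(A,W)\bigr).
\]
```

So the score of the fluctuation at ε = 0 is exactly the Y-component of D*(P); fluctuating the empirical distribution of W contributes the remaining component Q̄(1, W) − Ψ(P).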
13. Universal least favorable submodel
We define a 1-d universal least favorable submodel at P as a submodel {P_ε : ε} such that for all ε

(d/dε) log(dP_ε/dP) = D*(P_ε). (1)

This acts as a local least favorable submodel at any point ε on its path.
14. TMLE based on ULFM is a one-step TMLE
Let P^0_n be an initial estimator of P_0. Suppose that, given a P ∈ M, we can construct a universal least favorable parametric model {P^{ulfm}_ε : ε ∈ (−a, a)} ⊂ M. Let

ε^0_n = arg max_ε P_n log( dP^0_{n,ε} / dP^0_n ).

Let P^1_n = P^0_{n,ε^0_n}. Since ε^0_n is a local maximum, P^1_n solves its score equation, given by P_n D*(P^1_n) = 0. That is, the TMLE is given by Ψ(P^1_n).
15. Universal least favorable submodel for a 1-d target parameter
For ε ≥ 0, we recursively define

p_ε = p exp( ∫_0^ε D*(P_x) dx ), (2)

and, for ε < 0, we recursively define

p_ε = p exp( − ∫_ε^0 D*(P_x) dx ).
16. Universal LFM in terms of local LFM
One can also define it in terms of a given local LFM p^{lfm}: for ε > 0 and dε > 0, we have

p_{ε+dε} = p^{lfm}_{ε,dε}.

That is, p_{ε+dε} equals the local LFM {p^{lfm}_{ε,δ} : δ} through p_ε at local value δ = dε. Similarly, we define it for ε < 0.
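Numerically, this recursion suggests an Euler scheme: repeatedly apply a small local-LFM step p_{ε+dε} = p_ε (1 + dε · D*(P_ε)) until the empirical score P_n D*(P_ε) is (numerically) zero. A hypothetical toy in Python, with Ψ(P) = E_P O on a finite support, where the one-step TMLE along this path must recover the empirical mean:

```python
import numpy as np

support = np.arange(10.0)                 # finite support for a toy model
p = np.full(10, 0.1)                      # initial density estimate p^0_n (uniform)

rng = np.random.default_rng(3)
data = rng.choice(support, size=500, p=(np.arange(10) + 1) / 55.0)

for _ in range(100000):
    mu = float(np.sum(support * p))
    Dstar = support - mu                  # D*(P_eps)(o) = o - E_{P_eps} O for Psi(P) = E_P O
    Pn_Dstar = float(np.mean(data)) - mu  # empirical mean of the efficient influence curve
    if abs(Pn_Dstar) < 1e-9:              # MLE along the path: Pn D*(P_eps) = 0
        break
    var = float(np.sum(p * Dstar**2))     # dPsi/deps along the path
    step = float(np.clip(Pn_Dstar / var, -1e-3, 1e-3))
    p = p * (1.0 + step * Dstar)          # local LFM step p_{eps+d} = p_eps (1 + d D*(P_eps))
    p = p / p.sum()                       # renormalize (exact up to rounding)

psi_tmle = float(np.sum(support * p))
print(abs(psi_tmle - float(np.mean(data))) < 1e-6)
```

The step size in ε is capped at 10^-3 purely for stability; for this toy target an uncapped step is exact because dΨ/dε equals the variance of O along the path, so the scheme terminates with the plug-in equal to the empirical mean, as it must.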
17. A universal canonical submodel that targets a multidimensional target parameter
Let Ψ(P) = (Ψ(P)(t) : t) be multidimensional (e.g., infinite-dimensional). Let D*(P) = (D*_t(P) : t) be the vector-valued efficient influence curve. Consider the following recursively defined submodel: for ε ≥ 0, we define

p_ε = p Π_{[0,ε]} ( 1 + ( {P_n D*(P_x)}^T D*(P_x) / ‖P_n D*(P_x)‖ ) dx )
    = p exp( ∫_0^ε {P_n D*(P_x)}^T D*(P_x) / ‖P_n D*(P_x)‖ dx ). (3)
18. Score is Euclidean norm of empirical mean of vector efficient influence curve
Theorem:
{p_ε : ε ≥ 0} is a family of probability densities; its score at ε is a linear combination of D*_t(P_ε) for t ∈ τ, and is thus in the tangent space T(P_ε); and we have

(d/dε) P_n log p_ε = ‖P_n D*(P_ε)‖.

As a consequence, (d/dε) P_n L(P_ε) = 0 implies P_n D*(P_ε) = 0. Under regularity conditions, we also have {p_ε : ε} ⊂ M.
19. One-step TMLE of multi-dimensional target parameter
Let p^0_n ∈ M be an initial estimator of p_0. Let ε_n = arg max_ε P_n log p^0_{n,ε}. Let p*_n = p^0_{n,ε_n} and ψ*_n = Ψ(P*_n). We have

P_n D*(P*_n) = 0.
21. One-step TMLE of treatment-specific survival curve
We investigated the performance of the one-step TMLE of the treatment-specific survival curve based on O = (W, A, T̃ = min(T, C), Δ = I(T ≤ C)).
Data structure
• dynamic treatment intervention: W → d(W)
• S_d(t) is defined by Ψ(P)(t) = E_P[ P(T > t | A = d(W), W) ]
• Focus on d(W) = 1.
22. Efficient influence curve
The efficient influence curve for Ψ(P)(t) is (Hubbard et al., 2000)

D*_t(P) = Σ_{k ≤ t} h_t(g_A, S_{Ac}, S)(k, A, W) [ I(T̃ = k, Δ = 1) − I(T̃ ≥ k) λ(k | A = 1, W) ] + S(t | A = 1, W) − Ψ(P)(t)
        ≡ D*_{1,t}(g_A, S_{Ac}, S) + D*_{2,t}(P), (4)

where

h_t(g_A, S_{Ac}, S)(k, A, W) = − [ I(A = 1) I(k ≤ t) / ( g_A(A = 1 | W) S_{Ac}(k− | A, W) ) ] · S(t | A, W) / S(k | A, W).
23. From local least favorable submodel to universal least favorable submodel
• A local least favorable submodel (LLFM) for S_d(t) through an initial estimator of the conditional hazard:

logit(λ_{n,ε}(· | A = 1, W)) = logit(λ_n(· | A = 1, W)) + ε h_t. (5)

• Similarly, we obtain a local least favorable submodel for the vector (S_d(t) : t) by using the vector of clever covariates (h_t : t).
• These imply, as above, universal least favorable submodels for a single survival probability and for the whole survival function.
24. Simulations for one-step TMLE of survival curve
We investigated the performance of the one-step TMLE of the treatment-specific survival curve in two simulation settings.
Data structure
• O = (W, A, T) ∼ P_0
• A ∈ {0, 1}
• treatment intervention: W → d(W) = 1
• S_d(t) is defined by Ψ(P)(t) = E_P[ P(T > t | A = d(W), W) ]
25. Candidate estimators
1 Kaplan-Meier
2 Iterative TMLE for each single t separately
3 One-step TMLE targeting the whole survival curve S_d
26. Results
[Figure: Estimated treatment-specific survival curves over time (0-400) in Settings I and II, comparing KM(A=0), KM(A=1), the truth, the iterative TMLE, the one-step TMLE of the whole curve, and the initial fit.]
27. Monte-Carlo results (n = 100)
[Figure: Relative efficiency against the iterative TMLE, as a function of t (Setting I), for KM, the initial fit, the one-step survival-curve TMLE, and the iterative TMLE.]
29. Optimal intervention allocation: "Learn as you go"
Classic randomized trial: longer implementation, higher cost.
Targeted Learning for adaptive trial designs answers:
✓ Is the intervention effective?
✓ For whom?
✓ How much will they benefit?
Learn faster, with fewer patients.
30. Contextual multiple-bandit problem in computer science
Consider a sequence (W_n, Y_n(0), Y_n(1))_{n≥1} of i.i.d. random variables with common probability distribution P^F_0:
• W_n, the nth context (possibly high-dimensional)
• Y_n(0), the nth reward under action a = 0 (in ]0, 1[)
• Y_n(1), the nth reward under action a = 1 (in ]0, 1[)
We consider a design in which one sequentially
• observes the context W_n,
• carries out a randomized action A_n ∈ {0, 1} based on past observations and W_n,
• gets the corresponding reward Y_n = Y_n(A_n) (the other reward is not revealed),
resulting in an ordered sequence of dependent observations O_n = (W_n, A_n, Y_n).
31. Goal of the experiment
We want to estimate
• the optimal treatment allocation/action rule d_0: d_0(W) = arg max_{a=0,1} E_0{Y(a) | W}, which optimizes E Y_d over all possible rules d;
• the mean reward under this optimal rule d_0: Ψ(P^F_0) = E_0{Y(d_0(W))};
and we want to
• obtain maximally narrow valid confidence intervals (primary; the "statistics" side),
• minimize the regret (1/n) Σ_{i=1}^n (Y_i − Y_i(d_n)) (secondary; the "bandits" side).
This general contextual multiple-bandit problem has an enormous range of applications: e.g., online marketing, recommender systems, randomized clinical trials.
32. Bibliography (non-exhaustive!)
• Sequential designs
  • Thompson (1933), Robbins (1952)
  • specifically in the context of medical trials:
    - Anscombe (1963), Colton (1963)
    - response-adaptive designs: Cornfield et al. (1969), Zelen (1969), and many more since then
• Covariate-Adjusted Response-Adaptive (CARA) designs
  • Rosenberger et al. (2001), Bandyopadhyay and Biswas (2001), Zhang et al. (2007), Zhang and Hu (2009), Shao et al. (2010), . . . typically study
    - convergence of the design . . . in a correctly specified parametric model
  • van der Laan (2008), Chambaz and van der Laan (2013), Zheng, Chambaz and van der Laan (2015), Bibaut et al. (2019) concern
    - convergence of the design to the optimal rule (!), super-learning and HAL-MLE of the optimal rule, and TMLE of the optimal reward, with inference, without (e.g., parametric) assumptions.
34. General Longitudinal Data Structure
We observe n i.i.d. copies of a longitudinal data structure

O = (L(0), A(0), . . . , L(K), A(K), Y = L(K + 1)),

where A(t) denotes a discrete-valued intervention node, L(t) is an intermediate covariate realized after A(t − 1) and before A(t), t = 0, . . . , K, and Y is a final outcome of interest.
Survival example:
A(t) = (A_1(t), A_2(t))
A_1(t) = I(treated at time t)
A_2(t) = I(min(T, C) ≤ t, Δ = 0), the right-censoring indicator process
Δ = I(T ≤ C), the failure indicator
Y(t) = I(min(T, C) ≤ t, Δ = 1), the failure indicator process
Y(t) ⊂ L(t), Y = Y(K + 1).
35. Likelihood and Statistical Model
The probability distribution P_0 of O can be factorized according to the time-ordering as

p_0(O) = Π_{t=0}^{K+1} p_0(L(t) | Pa(L(t))) × Π_{t=0}^{K} p_0(A(t) | Pa(A(t)))
       ≡ Π_{t=0}^{K+1} q_{0,L(t)}(O) × Π_{t=0}^{K} g_{0,A(t)}(O)
       ≡ q_0 g_0,

where Pa(L(t)) ≡ (L̄(t − 1), Ā(t − 1)) and Pa(A(t)) ≡ (L̄(t), Ā(t − 1)) denote the parents of L(t) and A(t) in the time-ordered sequence, respectively. The g_0-factor represents the intervention mechanism.
Statistical Model: We make no assumptions on q_0, but could make assumptions on g_0.
36. Statistical Target Parameter: G-computation Formula for Post-dynamic-Intervention Distribution
• p^{g*}_0(o) = q_0(o) g*(o) is the G-computation formula for the post-intervention distribution of O under the stochastic intervention g* = Π_{t=0}^{K} g*_{A(t)}(O).
• In particular, for a dynamic intervention d = (d_t : t = 0, . . . , K), with d_t(L̄(t), Ā(t − 1)) the treatment at time t, the G-computation formula is given by

p^d_0(l) = Π_{t=0}^{K+1} q^d_{0,L(t)}(l̄(t)), (6)

where q^d_{L(t)}(l̄(t)) = q_{L(t)}(l(t) | l̄(t − 1), Ā(t − 1) = d̄_{t−1}(l̄(t − 1))).
• Let L^d = (L(0), L^d(1), . . . , Y^d = L^d(K + 1)) denote the random variable with probability distribution P^d.
38. A Sequential Regression G-computation Formula (Bang, Robins, 2005)
• By the iterated conditional expectation rule (tower rule), we have

E_{P^d} Y^d = E( . . . E( E(Y^d | L̄^d(K)) | L̄^d(K − 1)) . . . | L(0)).

• In addition, conditioning on L̄^d(K) is equivalent to conditioning on (L̄(K), Ā(K − 1) = d̄_{K−1}(L̄(K − 1))).
In this manner, one can represent E_{P^d} Y^d as an iterated conditional expectation: first take the conditional expectation given L̄^d(K) (equivalently, given L̄(K), Ā(K − 1)); then take the conditional expectation given L̄^d(K − 1) (equivalently, given L̄(K − 1), Ā(K − 2)); and so on, until the conditional expectation given L(0); finally, take the mean over L(0).
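The iteration above can be sketched in a hypothetical toy (K = 1, all variables binary, static rule d: always treat) with saturated regressions, i.e., stratified empirical means, so that each step is exactly the conditional expectation being described. The simulation model is invented for illustration.

```python
import numpy as np

expit = lambda x: 1.0 / (1.0 + np.exp(-x))
rng = np.random.default_rng(5)
n = 20000

# time-ordered simulation: L(0), A(0), L(1), A(1), Y
L0 = rng.binomial(1, 0.5, n)
A0 = rng.binomial(1, expit(L0 - 0.5))
L1 = rng.binomial(1, expit(-0.5 + L0 + A0))
A1 = rng.binomial(1, 0.5, n)
Y  = rng.binomial(1, expit(-1.0 + L0 + L1 + A1))

# Step 1: Qbar_2(L0, L1) = E(Y | L0, L1, A0 = 1, A1 = 1), saturated fit
Q2 = np.zeros(n)
for l0 in (0, 1):
    for l1 in (0, 1):
        cell = (L0 == l0) & (L1 == l1) & (A0 == 1) & (A1 == 1)
        Q2[(L0 == l0) & (L1 == l1)] = Y[cell].mean()

# Step 2: Qbar_1(L0) = E(Qbar_2(L0, L1) | L0, A0 = 1):
# averaging over L(1)'s distribution given the intervened treatment A0 = 1
Q1 = np.zeros(n)
for l0 in (0, 1):
    cell = (L0 == l0) & (A0 == 1)
    Q1[L0 == l0] = Q2[cell].mean()

# Step 3: mean over the marginal distribution of L(0)
psi_gcomp = float(Q1.mean())
print(round(psi_gcomp, 3))
```

At each step the regression conditions on the observed past but is evaluated with the treatment nodes set to their intervened values (here A(0) = A(1) = 1), exactly the equivalence noted on the slide.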
39. TMLE
• A likelihood-based TMLE was developed (van der Laan, Stitelman, 2010).
• A sequential regression TMLE Ψ(Q*_n) was developed for E Y_d in van der Laan, Gruber (2012).
• The latter builds on Bang and Robins (2005) by putting their innovative double robust efficient estimating-equation method, which uses sequential clever-covariate regressions to estimate the nuisance parameters of the estimating equation, into a TMLE framework.
• A TMLE for Euclidean summary measures of (E Y_d : d ∈ D) defined by marginal structural working models is developed in Petersen et al. (2013).
• A new TMLE (analogous to sequential regression) allowing for continuous-valued monitoring and time-to-event outcomes is forthcoming (Rytgaard, van der Laan, 2019).
41. A real-world CER study comparing different rules for
treatment intensification for diabetes
• Data extracted from diabetes registries of 7 HMO research network
sites:
• Kaiser Permanente
• Group Health Cooperative
• HealthPartners
• Enrollment period: Jan 1st 2001 to Jun 30th 2009
Enrollment criteria:
• past A1c < 7% (glucose level) while on 2+ oral agents or basal insulin
• 7% ≤ latest A1c ≤ 8.5% (study entry when glycemia was no longer
reined in)
42. Longitudinal data
• Follow-up until the earliest of Jun 30th 2010, death, health plan disenrollment, or the failure date
• Failure defined as onset/progression of albuminuria (a microvascular complication)
• Treatment is the indicator of being on "treatment intensification" (TI)
• n ≈ 51,000 with a median follow-up of 2.5 years
43. Impact of SL on IPTW
Back to the TI study: the impact of machine learning on inference with IPW.
[Figure: Estimated counterfactual survival P(T > t) (no weight truncation), by number t of 90-day intervals since study entry, for the treatment-intensification rules d7, d7.5, d8, d8.5. With a parametric model for the weights: no/weak evidence of a protective effect. With Super Learning: strong, significant evidence.]
46. tlverse - Targeted Learning software ecosystem in R
• A curated collection of R packages for Targeted Learning
• Shares a consistent underlying philosophy, grammar, and set of data
structures
• Open source
• Designed for generality, usability, and extensibility
• Microwave dinners for machine learning
47. Targeted Learning
van der Laan & Rose, Targeted Learning: Causal Inference for
Observational and Experimental Data. New York: Springer, 2011.