We develop sensitivity analyses for weak nulls in matched observational studies while allowing unit-level treatment effects to vary. In contrast to randomized experiments and paired observational studies, we show for general matched designs that over a large class of test statistics, any valid sensitivity analysis for the weak null must be unnecessarily conservative if Fisher's sharp null of no treatment effect for any individual also holds. We present a sensitivity analysis valid for the weak null, and illustrate why it is conservative if the sharp null holds through connections to inverse probability weighted estimators. An alternative procedure is presented that is asymptotically sharp if treatment effects are constant, and is valid for the weak null under additional assumptions which may be deemed reasonable by practitioners. The methods may be applied to matched observational studies constructed using any optimal without-replacement matching algorithm, allowing practitioners to assess robustness to hidden bias while allowing for treatment effect heterogeneity.
Causal Inference Opening Workshop - Testing Weak Nulls in Matched Observational Studies - Colin Fogarty, December 11, 2019
Testing Weak Nulls in Matched Observational Studies
Colin Fogarty
Massachusetts Institute of Technology
December 11, 2019
Robustness of Randomization Tests

Randomization tests provide exact tests for sharp null hypotheses, the most common of which is Fisher's sharp null:

HF : yi(0) = yi(1) for all i

Concern: are randomization tests of sharp nulls prone to misinterpretation?
- Perhaps a researcher will use a randomization test, but then take a rejection as evidence for the existence of a positive average effect...
- Related issue: practitioners often view permutation tests as nonparametric alternatives to t-tests when the two samples don't look normally distributed
Randomization Tests for Weak Nulls

Consider a completely randomized experiment under the finite population model. Suppose I want to use a randomization test to test the weak null hypothesis

HN : ¯y(1) − ¯y(0) = 0.

- Cannot use the randomization test based upon the observed treated-minus-control difference in means, ˆτ
- Can use the randomization test with the studentized difference in means,

  ˆτ / √(s²_T/n_T + s²_C/n_C),

  where s²_T and s²_C are the sample variances of the observed outcomes in the treated and control groups
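As an illustrative sketch (not code from the talk; the function names are hypothetical), the studentized randomization test can be carried out by permuting the treatment labels and recomputing the studentized statistic each time:

```python
import numpy as np

def studentized_stat(y, z):
    """Treated-minus-control difference in means divided by its
    standard error, sqrt(s_T^2/n_T + s_C^2/n_C)."""
    yt, yc = y[z == 1], y[z == 0]
    se = np.sqrt(yt.var(ddof=1) / len(yt) + yc.var(ddof=1) / len(yc))
    return (yt.mean() - yc.mean()) / se

def randomization_test(y, z, draws=2000, seed=0):
    """Monte Carlo p-value for the studentized randomization test in a
    completely randomized experiment: permute which units are labeled
    treated, holding the number of treated units fixed."""
    rng = np.random.default_rng(seed)
    observed = studentized_stat(y, z)
    perm_stats = np.empty(draws)
    for b in range(draws):
        perm_stats[b] = studentized_stat(y, rng.permutation(z))
    return np.mean(perm_stats >= observed)
```

The permutation distribution is exact under the sharp null; the studentization is what makes the same test asymptotically conservative under the weak null.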
Randomization Tests for Weak Nulls

The studentized randomization test yields a single, unified mode of inference under the finite population model that is
1. Exact under Fisher's sharp null
2. Asymptotically conservative under Neyman's weak null
   - Conservative inference is a fundamental property of inference under the finite population model with heterogeneous effects

While the sharp and weak nulls are surely different, the studentized randomization test obviates the distinction for practitioners.

See Loh et al. (2017); Wu and Ding (2018). Also see Chung and Romano (2013) for related developments for permutation tests.
What's Different for Observational Studies?

For randomized experiments, E(ˆτ) = 0 under both the sharp and weak nulls. But regardless of assuming the sharp or weak null, we would not expect E(ˆτ) = 0 in a matched observational study. Why? Unmeasured confounding.

In observational studies, we assess the robustness of our study's findings to unmeasured confounding through a sensitivity analysis.
- We'll review a model proposed by Rosenbaum for such an analysis in matched observational studies
- Existing methods under the model primarily focus on testing sharp null hypotheses.

What if effect heterogeneity induces larger discrepancies between ¯y(1) − ¯y(0) and the worst-case expectation of ˆτ?
Notation for Matched Studies

- The ith of B matched sets has ni individuals; N = Σ_{i=1}^{B} ni
- Individual j in set i has observed covariates xij and an unobserved covariate 0 ≤ uij ≤ 1.
- Using an optimal, without-replacement matching algorithm, treated and control individuals are placed into matched sets such that xij ≈ xij′ within each set.
- Consider post-strata with one treated individual and ni − 1 controls (what follows immediately extends to full matching).
- yij(1) and yij(0) are the potential outcomes under treatment and control for individual j in set i.
- τij = yij(1) − yij(0) is the treatment effect for each individual
Notation for Matched Studies

What do we observe?
- Zij is the treatment indicator (1 treated, 0 control).
- Yij = Zij yij(1) + (1 − Zij)yij(0) is the observed response
- ˆτi = Σ_{j=1}^{ni} Zij Yij − Σ_{j=1}^{ni} (1 − Zij)Yij/(ni − 1)
- ¯τi = Σ_{j=1}^{ni} τij/ni is the average of the treatment effects in set i.
- ¯τ = Σ_{i=1}^{B} (ni/N)¯τi is the sample average treatment effect.
- Let F = {yij(1), yij(0), xij, uij}, Ω = {z : Σ_{j=1}^{ni} zij = 1 for all i}, Z = {Z ∈ Ω}
- Inference moving forwards will condition upon F and Z.
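To make the notation concrete, here is a minimal sketch (names illustrative, not from the talk) computing the stratum-wise differences in means ˆτi from observed data, assuming one treated unit per matched set:

```python
import numpy as np

def stratum_differences(Y, Z, set_ids):
    """Compute tau_hat_i = (treated response) - (mean control response)
    within each matched set, assuming exactly one treated unit per set."""
    taus = {}
    for i in np.unique(set_ids):
        in_set = set_ids == i
        treated = Y[in_set & (Z == 1)]    # the single treated response
        controls = Y[in_set & (Z == 0)]   # the n_i - 1 control responses
        taus[i] = treated.item() - controls.mean()
    return taus
```

The sample average treatment effect ¯τ then weights the within-set averages ¯τi by set size, but ¯τi itself is never observed, only ˆτi is.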
A Simple Model for Hidden Bias

Let πij = pr(Zij = 1 | F). A simple model for hidden bias states that πij = pr(Zij = 1 | xij, uij), with 0 ≤ uij ≤ 1 and

log{πij/(1 − πij)} = κi + log(Γ)uij.

- Γ ≥ 1 controls the impact of hidden bias on treatment assignment.
- Equivalently, for individuals j, j′ in the same matched set i,

  1/Γ ≤ {πij(1 − πij′)}/{πij′(1 − πij)} ≤ Γ
A Simple Model for Hidden Bias

Let ρij = pr(Zij = 1 | F, Z)
- Conditions on the matched structure, removing dependence on the nuisance parameters κi
- At Γ = 1, ρij = 1/ni.
  - Recovers a finely stratified experiment (one treated, ni − 1 controls, ni equiprobable assignments).
  - Entitles one to modes of inference justified under those designs
- Γ > 1 allows for departures from the idealized finely stratified experiment that matching seeks to emulate,

  ρij = exp{log(Γ)uij} / Σ_{k=1}^{ni} exp{log(Γ)uik}
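The conditional assignment probabilities under the model can be checked numerically (an illustrative sketch, not code from the talk): at Γ = 1 every unit in a set is equally likely to be treated, while a confounder at the extremes of [0, 1] attains the worst-case probabilities.

```python
import numpy as np

def assignment_probs(u, gamma):
    """Conditional treatment probabilities within one matched set under
    the sensitivity model: rho_ij = Gamma^{u_ij} / sum_k Gamma^{u_ik},
    with each 0 <= u_ij <= 1."""
    w = gamma ** np.asarray(u, dtype=float)
    return w / w.sum()
```

For example, with u = (1, 0, 0) and Γ = 2 in a set of three, the favored unit is treated with probability Γ/(ni − 1 + Γ) = 1/2 and the others with 1/4 each.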
Sensitivity Analysis Under the Sharp Null

Let δij be the observed value of the treated-minus-control difference in means if individual j receives the treatment:

δij = yij(1) − Σ_{j′≠j} yij′(0)/(ni − 1).

Under the sharp null, yij(1) = yij(0), and the observed responses Yij impute the yij(0), and hence the δij.

Consider the randomization distribution based upon ˆτ,

pr(ˆτ ≥ a | F, Z) = Σ_{z∈Ω} 1{ˆτ(z) ≥ a} pr(Z = z | F, Z)

Even under the sharp null, we can't directly use this randomization distribution in observational studies because pr(Z = z | F, Z) is unknown.
Sensitivity Analysis Under the Sharp Null

For a given value of Γ, a sensitivity analysis tries to construct a random variable TΓ such that, under the sharp null,

pr(ˆτ ≥ a | F, Z) ≤ pr(TΓ ≥ a | F, Z)

- This bounding random variable is used to upper bound p-values
- Can be done exactly and tractably in a paired design
- Not tractable for general matched designs, so one typically resorts to asymptotic approximations (Gastwirth 2000)

Iteratively increase Γ until one can no longer reject the null. Attests to the robustness of the study's finding to hidden bias.
A Unified Procedure?

Constructing the worst-case random variable makes heavy use of the sharp null
- Finding worst-case treatment assignment probabilities requires knowledge of what the outcomes would have been for all assignments

Suppose we assume that the sensitivity model holds at Γ ≥ 1. Can we find a single, unified non-randomized hypothesis test ϕ(α, Γ) (1 if reject, 0 otherwise) such that, under suitable regularity conditions and for any u,
1. Under the sharp null, lim sup E{ϕ(α, Γ) | F, Z} ≤ α, with equality possible
2. Under the weak null, lim sup E{ϕ(α, Γ) | F, Z} ≤ α, with equality possible
Success for Paired Designs at Γ > 1

Define n new random variables

DΓi = ˆτi − {(Γ − 1)/(1 + Γ)}|ˆτi|.

- For any Γ, ˆτi has worst-case expectation {(Γ − 1)/(1 + Γ)}|ˆτi| under the sharp null.

Consider the standard error estimate

se(¯DΓ) = √[ {1/(n(n − 1))} Σ_{i=1}^{n} (DΓi − ¯DΓ)² ].

Consider the test

ϕ(α, Γ) = 1{ ¯DΓ/se(¯DΓ) ≥ Φ⁻¹(1 − α) },

where Φ(·) is the standard normal CDF.
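A minimal sketch of the paired-design procedure above (the function name and defaults are assumptions, not the talk's implementation): form DΓi, studentize, and compare with the standard normal quantile.

```python
import numpy as np
from statistics import NormalDist

def paired_sensitivity_test(tau_hat, gamma, alpha=0.05):
    """Studentized sensitivity analysis for paired designs:
    D_i = tau_hat_i - {(Gamma - 1)/(1 + Gamma)}|tau_hat_i|; reject when
    mean(D)/se(D) exceeds the standard normal 1 - alpha quantile.
    Returns (reject indicator, test statistic)."""
    tau_hat = np.asarray(tau_hat, dtype=float)
    n = len(tau_hat)
    D = tau_hat - (gamma - 1) / (1 + gamma) * np.abs(tau_hat)
    se = np.sqrt(np.sum((D - D.mean()) ** 2) / (n * (n - 1)))
    stat = D.mean() / se
    return int(stat >= NormalDist().inv_cdf(1 - alpha)), stat
```

At Γ = 1 the correction term vanishes and DΓi = ˆτi, recovering the usual studentized test for a finely stratified experiment.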
Success for Paired Designs at Γ > 1

An Asymptotic Sensitivity Analysis for Neyman's Null
Under mild regularity conditions, if the sensitivity model holds at Γ and the weak null holds,

lim sup_{n→∞} E{ϕ(α, Γ) | F, Z} ≤ α.

Furthermore, one can replace the standard normal reference distribution with a studentized randomization distribution (with biased assignment probabilities governed by Γ) such that
1. Under the sharp null, E{ϕ(α, Γ) | F, Z} ≤ α for any sample size, with equality possible
2. Under the weak null, lim sup E{ϕ(α, Γ) | F, Z} ≤ α, with equality possible

See Fogarty (2019+) for more details.
Γ > 1, General Matched Designs

For a given Γ, consider as a test statistic any monotone increasing function of the stratumwise differences in means, hΓ,ni(ˆτi)
- May depend upon Γ and ni

Let µΓi be the worst-case expectation of hΓ,ni(ˆτi) under the sharp null hypothesis
- Easy to compute through asymptotic separability (could also just solve the LP).
- µΓi is random under the weak null, varying with Z.
An Impossibility Result

Consider

E[ N⁻¹ Σ_{i=1}^{B} {hΓ,ni(ˆτi) − µΓi} ]

- Bounded above by zero under the sharp null if the model holds at Γ, with equality possible. What about the weak null?

Failure to Control the Worst-Case Expectation
Suppose the sensitivity model holds at Γ. Then, for any choice of functions {hΓ,ni}_{ni ≥ 2}, there exist combinations of stratum sizes, potential outcomes satisfying the weak null, and unmeasured confounders u such that

lim inf_{B→∞} N⁻¹ Σ_{i=1}^{B} E{hΓ,ni(ˆτi) − µΓi} > 0
Implications

The expectations cannot be simultaneously tightly controlled
- If a sensitivity analysis is asymptotically exact (rather than conservative) under the sharp null, it can only be valid for a subset of the weak null
- If a sensitivity analysis is valid for the entirety of the weak null, it must be unnecessarily conservative if only the sharp null holds.
- Using a studentized test statistic does nothing to help! The problem stems from misalignment of the worst-case expectations
Valid Inference for the Weak Null

For a given Γ, the conditional probabilities ρij are bounded as

1/{Γ(ni − 1) + 1} ≤ ρij ≤ Γ/(ni − 1 + Γ),

and are further constrained by Σ_{j=1}^{ni} ρij = 1.

If we knew the ρij, an unbiased estimator for ¯τi given F, Z would be

IPWi = (1/ni) ˆτi / Σ_{j=1}^{ni} Zij ρij

- At Γ = 1, IPWi = ˆτi.
- For Γ > 1, can't use IPWi for any i as the ρij are unknown
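A quick numerical check (illustrative, with assumed names) that IPWi is unbiased for ¯τi when the assignment probabilities are known: averaging IPWi over the ni possible assignments recovers the within-set average effect exactly, because the probability of the treated unit cancels out of each term.

```python
import numpy as np

def ipw_expectation(y1, y0, u, gamma):
    """For one matched set with known potential outcomes, compute
    E(IPW_i | F, Z) under the sensitivity model, where
    IPW_i = (1/n_i) * tau_hat_i / rho_(treated unit).
    Returns (expectation of IPW_i, within-set average effect)."""
    y1, y0 = np.asarray(y1, float), np.asarray(y0, float)
    n = len(y1)
    rho = gamma ** np.asarray(u, dtype=float)
    rho = rho / rho.sum()                    # assignment probabilities
    expectation = 0.0
    for j in range(n):                        # case: unit j is treated
        delta_j = y1[j] - np.delete(y0, j).mean()   # tau_hat_i for this case
        expectation += rho[j] * delta_j / (n * rho[j])
    return expectation, (y1 - y0).mean()
```

The cancellation of rho[j] in each summand is the point: the estimator is unbiased no matter which u generated the probabilities, so the difficulty for Γ > 1 is purely that the ρij are unknown.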
Worst-Case Weighting

We can't use IPWi as the ρij are unknown. Consider instead

WΓi = (1/ni) ˆτi / [ {Γ/(ni − 1 + Γ)}1(ˆτi ≥ 0) + {1/(Γ(ni − 1) + 1)}1(ˆτi < 0) ].

Weights ˆτi by the worst-case assignment probability if the sensitivity model holds at Γ:
- E(WΓi | F, Z) ≤ ¯τi
- E{Σ_{i=1}^{B} (ni/N)WΓi | F, Z} ≤ ¯τ
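The worst-case weighting can be sketched as follows (illustrative helper name): the denominator uses the largest feasible assignment probability when ˆτi ≥ 0 and the smallest when ˆτi < 0, so that the term's expectation is biased downward under the model.

```python
def w_gamma(tau_hat_i, n_i, gamma):
    """Worst-case IPW term W_{Gamma,i}: divide tau_hat_i by n_i times the
    most conservative feasible assignment probability given its sign."""
    if tau_hat_i >= 0:
        rho_star = gamma / (n_i - 1 + gamma)      # largest feasible prob
    else:
        rho_star = 1 / (gamma * (n_i - 1) + 1)    # smallest feasible prob
    return tau_hat_i / (n_i * rho_star)
```

At Γ = 1 both bounds collapse to 1/ni and WΓi = ˆτi, recovering the unbiased estimator from the previous slide.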
Why Might This Be Conservative?

Suppose ni = 3, the potential values for ˆτi given F, Z are δi1 = 5, δi2 = −2, δi3 = −3, and we conduct inference at Γ = 2.

What would the worst-case probabilities employed by W2i be?

W2i = (1/3) ˆτi / { (1/2)1(ˆτi ≥ 0) + (1/5)1(ˆτi < 0) }.

- δi1 = 5 ⇒ ρ*i1 = 1/2
- δi2 = −2 ⇒ ρ*i2 = 1/5
- δi3 = −3 ⇒ ρ*i3 = 1/5

Σ_{j=1}^{ni} ρ*ij = 9/10. Not a probability distribution!
- If the model holds at Γ = 2, E(W2i | F, Z) < 0.
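The slide's worked example can be reproduced directly: the sign-dependent worst-case probabilities sum to 9/10 rather than 1.

```python
# Reproducing the n_i = 3, Gamma = 2 example: the worst-case
# probabilities used by W_{2,i} depend only on the sign of each
# potential difference delta_ij, and need not sum to one.
gamma, n_i = 2, 3
deltas = [5, -2, -3]
rho_star = [gamma / (n_i - 1 + gamma) if d >= 0
            else 1 / (gamma * (n_i - 1) + 1)
            for d in deltas]
print(rho_star)       # [0.5, 0.2, 0.2]
print(sum(rho_star))  # ~0.9 -- not a probability distribution
```

Because the implied probabilities undershoot 1, each ˆτi is divided by too large a denominator in expectation, so W2i has strictly negative expectation even when the sharp null holds.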
Incompatibility

In general, the worst-case IPW estimator WΓi does not generate worst-case probabilities corresponding to a valid probability distribution.
- Under the sharp null, we can find the worst-case probability distribution because we know δij for j = 1, ..., ni
  - WΓi is unduly conservative for the sharp null, and can be improved upon by weighting with the worst-case distribution.
- Under the weak null, we can't impose the constraint! We only know δij for one of the ni individuals.
  - Any attempt at adjusting the ρ*ij runs the risk of yielding liberal inference.
- Incompatibility is not a deficiency of matching - it is prevalent in many modes of sensitivity analysis
An Alternative Weighting

Consider the alternative weighting

˜WΓi = (1/ni) ˆτi / [ {2Γ/(ni(1 + Γ))}1(ˆτi ≥ 0) + {2/(ni(1 + Γ))}1(ˆτi < 0) ].

Relative to before...

1/{Γ(ni − 1) + 1} ⇒ 2/{ni(1 + Γ)}  (larger)
Γ/(ni − 1 + Γ) ⇒ 2Γ/{ni(1 + Γ)}  (smaller)
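A small numeric comparison of the two weightings (illustrative) for ni = 3 and Γ = 2, confirming that the lower probability grows and the upper probability shrinks:

```python
# Compare the original worst-case probabilities with the alternative
# weighting's: the lower weight grows and the upper weight shrinks,
# both moving toward the 2/(n_i(1 + Gamma)) scale.
def prob_bounds(n_i, gamma):
    orig = (1 / (gamma * (n_i - 1) + 1), gamma / (n_i - 1 + gamma))
    alt = (2 / (n_i * (1 + gamma)), 2 * gamma / (n_i * (1 + gamma)))
    return orig, alt

(orig_lo, orig_hi), (alt_lo, alt_hi) = prob_bounds(3, 2)
print(orig_lo, alt_lo)  # 0.2 vs ~0.222 (larger)
print(orig_hi, alt_hi)  # 0.5 vs ~0.444 (smaller)
```

Note the alternative probabilities are in the fixed ratio Γ:1 and average to 1/ni within a set, which is what makes the bound on the expectation attainable under the sharp null.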
77. Expectation under the Sharp Null
This weighting scheme has nice properties under the sharp null
The Worst-Case Expectation Under the Sharp Null
Suppose the sensitivity model holds at Γ. Then,
E(˜WΓi | F, Z) ≤ 0,
and equality holds when uij = 1{δij ≥ 0}
So, taking a weighted average, under the sharp null
E{Σ_{i=1}^{B} (ni/N) ˜WΓi | F, Z} ≤ 0
Because it sharply bounds the expectation under the sharp null, it
must only do so for a subset of the weak null!
79. The Expectation Under the Weak Null
For other matched designs, if the weak null holds but the sharp null
does not,
E{Σ_{i=1}^{B} (ni/N) ˜WΓi} ≤ CΓ Σ_{i=1}^{B} (ni/N) [1 + {(1 − Γ)/Γ} pr(ˆτi ≥ ¯τi | F, Z)] ¯τi
A sufficient condition for the worst-case expectation to be
controlled under the weak null is that, for the worst-case
confounder,
cov{pr(ˆτi ≥ ¯τi | F, Z), ni ¯τi} ≥ 0
84. Reasonable?
Consider uij = 1{δij ≥ ¯τi}. The sufficient condition can be rewritten as
cov[ ni / Σ_{j=1}^{ni} {1(δij < ¯τi) + Γ 1(δij ≥ ¯τi)}, ni ¯τi ] ≤ 0.
The first term is a function of the stratum size ni and
Σ_{j=1}^{ni} 1(δij − ¯τi ≥ 0); the second term is ni ¯τi
(1/ni) Σ_{j=1}^{ni} (δij − ¯τi) ¯τi = 0 for all i!
The covariance may be nonzero only because residuals being
uncorrelated with fitted values does not imply that functions of the
residuals are uncorrelated with the fitted values
Do we care?
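The orthogonality identity above is easy to check numerically. A toy illustration with made-up effects δij; the helper name is ours.

```python
def residual_check(d):
    """For one matched set's effects d = (delta_i1, ..., delta_in), return
    (tau_bar, mean residual times tau_bar, count of effects >= tau_bar)."""
    n_i = len(d)
    tau_bar = sum(d) / n_i
    lin = sum(x - tau_bar for x in d) * tau_bar / n_i   # always exactly zero
    indicator = sum(x >= tau_bar for x in d)            # nonlinear in the residuals
    return tau_bar, lin, indicator

# Three hypothetical matched sets: the residuals are orthogonal to tau_bar
# in every set, yet the indicator count varies from set to set.
for d in ([0.0, 2.0, 4.0], [1.0, 1.0, 1.0], [-3.0, 0.0, 0.0]):
    tau_bar, lin, count = residual_check(d)
    assert abs(lin) < 1e-12
```

The point mirrored here: the linear function of the residuals is identically zero, but an indicator of the residuals carries information that may co-vary with ¯τi.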
86. Reasonable?
Are we worried about hidden bias acting in this way?
To be anticonservative, the skewness of δij − ¯τi must relate to the
values of ¯τi
If we decide this doesn’t matter, a sensitivity analysis using the test
statistic Σ_{i=1}^{B} (ni/N) ˜WΓi yields a single procedure that...
Controls the worst-case expectation under the sharp null, with
equality possible
Controls the worst-case expectation under a (potentially
benign?) subset of the weak null
Variance estimation and asymptotic normality follow naturally
from Fogarty (2018); Pashley and Miratrix (2019+)
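The aggregated statistic can be sketched in a few lines. Illustrative only: the helper name is ours, and the per-set weights restate the alternative weighting display.

```python
def weak_null_statistic(tau_hats, sizes, gamma):
    """Compute sum_i (n_i / N) * W~_{Gamma,i} from per-set estimates tau_hats
    and matched-set sizes, at sensitivity parameter gamma (Gamma >= 1)."""
    N = sum(sizes)
    total = 0.0
    for tau_hat, n_i in zip(tau_hats, sizes):
        if tau_hat >= 0:
            weight = 2 * gamma / (n_i * (1 + gamma))
        else:
            weight = 2 / (n_i * (1 + gamma))
        total += (n_i / N) * tau_hat / (n_i * weight)
    return total
```

In use, this statistic would be compared against a studentizing standard error of the kind developed in Fogarty (2018) to assess evidence against the weak null at a given Γ.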
91. Conclusions
In randomized experiments, randomization tests using suitably
studentized differences-in-means typically yield a single mode of
inference for both the sharp (exact) and weak (asymptotic) nulls
This continues to be true in a sensitivity analysis for paired
observational studies with a careful choice of test statistic
Modes of inference cannot be unified in the same way when
conducting a sensitivity analysis for general forms of matching
If we truly require correctness over the entirety of the weak null,
we must be conservative if the sharp null is true
We presented a test statistic that can be exact for the sharp null,
and valid over a potentially innocuous subset of the weak null
92. Thanks!
Fogarty, C.B. Testing weak nulls in matched observational
studies. arXiv preprint.
Fogarty, C.B. (2019+). Studentized sensitivity analysis for the
sample average treatment effect in paired observational studies.
Journal of the American Statistical Association.
Fogarty, C.B. (2018). On mitigating the analytical limitations of
finely stratified experiments. Journal of the Royal Statistical
Society, Series B.
93. Paired Designs
For paired designs, ˜WΓi equals WΓi, and is equivalent to the unifying
procedure described in Fogarty (2019+).
Paired Designs
Recall that DΓi = ˆτi − {(Γ − 1)/(1 + Γ)}|ˆτi|. ˜WΓi and DΓi are
proportional,
˜WΓi = {(1 + Γ)^2/(4Γ)} DΓi,
such that E{B^{-1} Σ_{i=1}^{B} ˜WΓi | F, Z} ≤ 0 under the weak null
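The claimed proportionality can be verified directly for pairs. A minimal numerical check; the function names are ours, and w_tilde_pair specializes the alternative weighting to ni = 2.

```python
def d_gamma(tau_hat, gamma):
    # D_{Gamma,i} = tau_hat - {(Gamma - 1)/(1 + Gamma)} * |tau_hat|
    return tau_hat - (gamma - 1) / (1 + gamma) * abs(tau_hat)

def w_tilde_pair(tau_hat, gamma):
    # Alternative weighting for a pair: weights Gamma/(1+Gamma) and 1/(1+Gamma)
    weight = gamma / (1 + gamma) if tau_hat >= 0 else 1 / (1 + gamma)
    return tau_hat / (2 * weight)

# W~_{Gamma,i} = {(1 + Gamma)^2 / (4 Gamma)} * D_{Gamma,i} for every tau_hat
for gamma in (1.0, 1.5, 3.0):
    for tau in (-2.3, 0.0, 1.7):
        lhs = w_tilde_pair(tau, gamma)
        rhs = (1 + gamma) ** 2 / (4 * gamma) * d_gamma(tau, gamma)
        assert abs(lhs - rhs) < 1e-12
```

At Γ = 1 both sides reduce to ˆτi itself, recovering the unbiased paired difference.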
94. The Propensity Score as a Nuisance Parameter
For inference on average treatment effects under strong ignorability,
the propensity score, e(x) = pr(Z = 1 | X = x), is a nuisance
parameter.
Not of primary interest, but important for inference
Inverse Propensity Weighting, AIPW, ...
Use a plug-in estimator, ˆe(x)
Pay attention to the rate of convergence of ˆe(x) and its impact
on the resulting inference
How does matching try to handle nuisance parameters?
A mix of conditioning (dealing with observed covariates) and
optimizing (sensitivity to unmeasured confounders)
Imperfections in this strategy will be investigated here
95. A Simple Model with Hidden Bias
Suppose treatment is strongly ignorable given (X, U) where U is
unobserved, and consider the following model for the probability of
assignment to treatment:
logit{pr(Z = 1 | X = x, U = u)} = x^T β + log(Γ)u
x are observed covariates
u ∈ [0, 1] is an unobserved scalar covariate
Γ is a sensitivity parameter
96. Conditional Permutation Tests
Let Ω = {z : z^T x = a}, where a is the observed value of Z^T x in the
study, and consider conditioning upon Z ∈ Ω
Rosenbaum (1984) with hidden bias:
pr(Z = z | x, u, Z ∈ Ω) = exp{log(Γ)z^T u} / Σ_{b∈Ω} exp{log(Γ)b^T u}
Z^T x is sufficient for β, so conditioning removes dependence on β
Inference still depends on u
Example: Suppose x ∈ R^{N×B}, where the ith of B columns contains
an indicator for membership in the ith stratum
βi = stratum-specific slope
(Z^T x)i = number of treated in the ith stratum
97. Similarities with Matching
Consider optimal matching without replacement.
The ith matched set is of size ni
mi is the number of treated individuals in matched set i
Letting Ω = {z : Σ_{j=1}^{ni} zij = mi}, Rosenbaum (2002) suggests
using the following model for biased treatment assignments:
pr(Z = z | x, u, Z ∈ Ω) = exp{log(Γ)z^T u} / Σ_{b∈Ω} exp{log(Γ)b^T u},
aligning with the randomization distribution on the previous slide
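To make the conditional distribution concrete, here is a small enumeration for one matched set. A toy sketch: the function name is ours, and the confounder vector u is made up.

```python
from itertools import combinations

def biased_assignment_probs(u, m, gamma):
    """Enumerate Omega = {z : sum_j z_j = m} within one matched set and return
    each z with pr(Z = z | u, Z in Omega) proportional to Gamma^(z^T u)."""
    n = len(u)
    omega = [[1 if j in treated else 0 for j in range(n)]
             for treated in combinations(range(n), m)]
    # Gamma^(z^T u) is the same as exp{log(Gamma) * z^T u}
    weights = [gamma ** sum(zj * uj for zj, uj in zip(z, u)) for z in omega]
    total = sum(weights)
    return [(z, w / total) for z, w in zip(omega, weights)]

# Size-3 matched set, one treated unit, u = (1, 0, 0): the first unit is
# Gamma times as likely as either of the others to be the treated one.
probs = biased_assignment_probs([1, 0, 0], 1, gamma=2.0)
```

At Γ = 1 the distribution reduces to uniform assignment over Ω, i.e., the randomization distribution of a finely stratified experiment.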
98. Matching as a Search for Sufficiency
Modify the logit form to allow for nonlinearities in x:
logit{pr(Z = 1 | X = x, U = u)} = ϕ(x) + log(Γ)u
In using the conditional inference described in Rosenbaum (2002), we
assume:
1 ϕ(xij) = ϕ(xik) for individuals j and k in the same matched set i
2 The matched structure would be invariant over z ∈ Ω
Of course, these won’t hold in practice:
1 Discrepancies in ϕ(xij) are inevitable (induce bias)
2 Had I observed a different z ∈ Ω, I may have attained a different
match (affects variance)
How much of a difference can this make? Working paper...