Causal Inference Methods: Event Studies, DiD, RDD & IV
Difference-in-differences (DiD) methods are widely used to evaluate policy rollouts that affect different places at different times. This guide focuses on practical issues that arise when treatment timing is staggered across units and on modern applied estimators that address those issues.
Definition: A staggered rollout is a treatment that arrives at different units in different periods.
Staggered adoption is common in real-world policies:
Definition: A cohort is the set of units that receive treatment for the first time in the same period.
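As a rough illustration of the cohort definition, the sketch below groups units by their first treated period. The toy panel, the unit labels, and the function names are all hypothetical; they are not taken from any particular dataset or library.

```python
from collections import defaultdict

# Hypothetical toy panel of (unit, period, treated) rows; values are illustrative only.
panel = [
    ("A", 1, 0), ("A", 2, 1), ("A", 3, 1),
    ("B", 1, 0), ("B", 2, 0), ("B", 3, 1),
    ("C", 1, 0), ("C", 2, 1), ("C", 3, 1),
    ("D", 1, 0), ("D", 2, 0), ("D", 3, 0),
]


def first_treated_period(rows):
    """Map each unit to its first treated period, or None if it is never treated."""
    first = {}
    for unit, period, treated in rows:
        first.setdefault(unit, None)
        if treated and (first[unit] is None or period < first[unit]):
            first[unit] = period
    return first


def cohorts(rows):
    """Group units by first treated period; the key None collects never-treated units."""
    groups = defaultdict(list)
    for unit, start in first_treated_period(rows).items():
        groups[start].append(unit)
    return {start: sorted(units) for start, units in groups.items()}


cohort_map = cohorts(panel)
print(cohort_map)
```

Here units A and C form the period-2 cohort, B forms the period-3 cohort, and D is never treated.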
The common two-way fixed effects (TWFE) regression pools all treated units and times into one coefficient. Under staggered timing, TWFE estimates are a weighted average of many 2x2 DiD comparisons. Some of those comparisons are invalid when treatment effects are dynamic or heterogeneous, producing bias and sometimes even estimates with the wrong sign.
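A small noise-free simulation can make this concrete. The sketch below assumes a balanced panel with one unit per cohort (adoption in periods 3 and 6, plus a never-treated unit), additive unit and period effects, and a hypothetical effect path; all names and numbers are illustrative. The TWFE coefficient is computed with the two-way within transformation, which is equivalent to a regression with unit and period dummies only in a balanced panel.

```python
def two_way_demean(m):
    """Subtract unit (row) and period (column) means, then add back the grand mean."""
    n, t = len(m), len(m[0])
    row = [sum(r) / t for r in m]
    col = [sum(m[i][j] for i in range(n)) / n for j in range(t)]
    grand = sum(row) / n
    return [[m[i][j] - row[i] - col[j] + grand for j in range(t)] for i in range(n)]


def twfe_coefficient(y, d):
    """TWFE slope on d for a balanced panel, via the two-way within transformation."""
    y_res, d_res = two_way_demean(y), two_way_demean(d)
    cells = [(i, j) for i in range(len(d)) for j in range(len(d[0]))]
    num = sum(d_res[i][j] * y_res[i][j] for i, j in cells)
    den = sum(d_res[i][j] ** 2 for i, j in cells)
    return num / den


def simulate(effect, starts=(3, 6, None), periods=10):
    """Noise-free panel with one unit per cohort; returns outcomes, treatment, true ATT."""
    y, d, effects = [], [], []
    for i, start in enumerate(starts):
        y_row, d_row = [], []
        for t in range(periods):
            treated = start is not None and t >= start
            tau = effect(t - start) if treated else 0.0
            y_row.append(2.0 * i + 0.5 * t + tau)  # unit effect + common trend + effect
            d_row.append(1.0 if treated else 0.0)
            if treated:
                effects.append(tau)
        y.append(y_row)
        d.append(d_row)
    return y, d, sum(effects) / len(effects)


# Constant, homogeneous effect: TWFE recovers the ATT.
y, d, att_constant = simulate(lambda e: 2.0)
twfe_constant = twfe_coefficient(y, d)

# Effect grows by 1 each period after adoption: TWFE falls short of the ATT.
y, d, att_dynamic = simulate(lambda e: 1.0 + e)
twfe_dynamic = twfe_coefficient(y, d)

print(f"constant effect: TWFE={twfe_constant:.3f}, ATT={att_constant:.3f}")
print(f"dynamic effect:  TWFE={twfe_dynamic:.3f}, ATT={att_dynamic:.3f}")
```

In this particular design the pooled coefficient matches the ATT when the effect is constant, but comes out at about 2.27 against a true ATT of about 3.45 once the effect grows with time since adoption.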
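Goodman-Bacon (2021) shows that the TWFE coefficient equals a weighted average of every possible 2x2 DiD between cohorts and time periods. These 2x2 pieces fall into three types:
(1) A treated cohort compared with never-treated units.
(2) An earlier-treated cohort compared with a later-treated cohort, using the later cohort as the control while it is still untreated.
(3) A later-treated cohort compared with an earlier-treated cohort, using the already-treated earlier cohort as the control.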
Why (3) is forbidden: the earlier-treated units already carry evolving post-treatment effects, so they are a moving target and contaminate the estimate.
Definition: A forbidden comparison uses as a control group units that are already affected by treatment in the comparison period.
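De Chaisemartin and D’Haultfœuille (2020) show that TWFE can place negative weights on some unit-time treatment effects. One consequence is that the sign of the pooled coefficient can mislead: it can be negative even when every underlying treatment effect is positive.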
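The weights can be inspected directly in a small design. The sketch below assumes a balanced panel, a hypothetical two-cohort design with no never-treated units, and untreated outcomes that are additive in unit and period effects (so that the TWFE coefficient is exactly a weighted sum of the treated cells' effects). Function and variable names are illustrative.

```python
def twfe_weights(d):
    """Weight that TWFE puts on each treated cell's effect in a balanced panel.

    Treatment is residualised on unit and period means; a treated cell's weight
    is its residual divided by the sum of residuals over all treated cells.
    """
    n, t = len(d), len(d[0])
    row = [sum(r) / t for r in d]
    col = [sum(d[i][j] for i in range(n)) / n for j in range(t)]
    grand = sum(row) / n
    resid = {(i, j): d[i][j] - row[i] - col[j] + grand
             for i in range(n) for j in range(t) if d[i][j] == 1}
    total = sum(resid.values())
    return {cell: r / total for cell, r in resid.items()}


# Hypothetical design: two cohorts adopting in periods 2 and 6, no never-treated units.
starts, periods = (2, 6), 10
d = [[1 if t >= s else 0 for t in range(periods)] for s in starts]
weights = twfe_weights(d)

# Assumed effect path: positive everywhere, growing by 1 per period since adoption.
effects = {(i, t): 1.0 + (t - starts[i]) for (i, t) in weights}
twfe = sum(weights[c] * effects[c] for c in weights)

negative = sorted(c for c, w in weights.items() if w < 0)
print("cells with negative weight:", negative)
print(f"TWFE coefficient: {twfe:.3f}")
```

In this design the negative weights fall on the early cohort's later periods, which is when it serves as a control for the late cohort. Because those cells also carry the largest effects, the weighted sum turns negative even though no individual effect is.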
When TWFE uses an earlier cohort whose post-treatment effect is still rising as the control for a later-treated cohort, the DiD subtracts the earlier cohort's continued growth from the later cohort's change. That growth is a treatment effect, not a counterfactual trend, so the comparison understates the true effect for the later cohort.
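The mechanism can be checked with a few lines of arithmetic. The sketch below assumes noise-free group means, a common trend of 0.5 per period, and a hypothetical effect equal to 1 plus the number of periods since adoption. It compares a late cohort (adopting in period 6) against a never-treated control and against an early cohort (adopting in period 2).

```python
def mean(values):
    values = list(values)
    return sum(values) / len(values)


def outcome(start, t):
    """Noise-free group mean: common trend plus an effect of 1 + periods since adoption."""
    base = 0.5 * t
    if start is not None and t >= start:
        return base + 1.0 + (t - start)
    return base


def did_2x2(treated_start, control_start, pre, post):
    """Change in the treated group minus change in the control group."""
    def change(start):
        return mean(outcome(start, t) for t in post) - mean(outcome(start, t) for t in pre)
    return change(treated_start) - change(control_start)


pre, post = range(2, 6), range(6, 10)  # late cohort adopts in period 6

clean = did_2x2(6, None, pre, post)   # control: never-treated units
forbidden = did_2x2(6, 2, pre, post)  # control: early cohort, already treated

print(clean)      # 2.5, the late cohort's average effect over periods 6 to 9
print(forbidden)  # -1.5, because the early cohort's effect grew by 4 over the window
```

The clean comparison returns the late cohort's average effect. The forbidden one subtracts the early cohort's effect growth (from an average of 2.5 to 6.5) and flips the sign.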
Definition: Dynamic effects are treatment effects that change with time since treatment (e.g., build-up over years).
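TWFE delivers the average treatment effect on the treated (ATT) only if both conditions hold:
(1) Treatment effects are homogeneous across cohorts.
(2) Treatment effects are constant over time since treatment (no dynamics).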
In most policy settings, neither condition holds; therefore, a single TWFE coefficient is usually unreliable.
Keywords: Event studies (causal inference), Event studies (finance & time series), Event studies (econometrics & DiD), Difference-in-differences methods & theory, Difference-in-differences applied estimators & issues, Applications, Instrumental variables, Regression discontinuity
Key concepts: A staggered rollout is treatment arriving at different units in different periods. TWFE can be biased under staggered timing when effects are dynamic or heterogeneous. Goodman-Bacon decomposes TWFE into all 2x2 DiD pieces, some invalid. Forbidden comparisons use already-treated units as controls and contaminate estimates. TWFE can assign negative weights, so its sign can be misleading. Callaway–Sant'Anna constructs group-time ATTs robust to heterogeneity and dynamics. did2s residualises using untreated observations, then regresses on treatment. Diagnose by plotting treatment timing and cohort event-time dynamics. Prefer cohort-specific effects or aggregated ATTs over a single pooled TWFE coefficient. Use cluster-robust or estimator-appropriate inference (bootstrap or analytic).