Difference-in-differences Design with Outcomes Missing Not at Random
Sooahn Shin

TL;DR
This paper introduces a novel method for difference-in-differences analysis with panel data where outcomes are missing not at random, utilizing principal strata and tailored Lee bounds to improve causal effect identification.
Contribution
It proposes a new identification strategy based on principal strata and tailored Lee bounds that handles non-random missingness without assuming independence or homogeneous effects.
Findings
The method accounts for non-random missing data in DID designs.
It relaxes independence and homogeneity assumptions.
It provides partial identification of causal effects when outcomes are missing not at random.
Abstract
This paper addresses one of the most prevalent problems encountered by political scientists working with the difference-in-differences (DID) design: missingness in panel data. A common practice for handling missing data, known as complete case analysis, is to drop cases with any missing values over time. A more principled approach involves using nonparametric bounds on causal effects or applying inverse probability weighting based on baseline covariates. Yet these methods are general remedies that often under-utilize the assumptions already imposed on the panel structure for causal identification. In this paper, I outline the pitfalls of complete case analysis and propose an alternative identification strategy based on principal strata. To be specific, I impose a parallel trends assumption within each latent group that shares the same missingness pattern (e.g., always-respondents,…
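The pitfall of complete case analysis mentioned in the abstract can be illustrated with a small simulation. This is a minimal sketch and not the paper's method or data. The data-generating process, the constants `TAU`, `TREND`, and `CUTOFF`, and the helper names `simulate_group` and `did` are all assumptions made for illustration. Parallel trends holds in the full population, but the post-period outcome is recorded only when it falls below a cutoff, so missingness depends on the outcome itself.

```python
import random
from statistics import mean

random.seed(0)

TAU = 1.0     # true treatment effect (assumed for illustration)
TREND = 0.5   # common time trend shared by both groups
CUTOFF = 1.0  # hypothetical rule: post-period outcome is recorded only below this value
N = 20000     # units per group


def simulate_group(treated):
    """Return (pre, post, observed) tuples for one group."""
    rows = []
    for _ in range(N):
        y_pre = random.gauss(0, 1)
        y_post = y_pre + TREND + TAU * treated + random.gauss(0, 1)
        observed = y_post < CUTOFF  # missingness depends on the outcome itself
        rows.append((y_pre, y_post, observed))
    return rows


def did(treated_rows, control_rows):
    """Difference in mean pre-to-post changes between the two groups."""
    def mean_change(rows):
        return mean(post - pre for pre, post, _ in rows)
    return mean_change(treated_rows) - mean_change(control_rows)


treated = simulate_group(1)
control = simulate_group(0)

# Infeasible benchmark: uses every unit, including those whose outcome would be missing.
full_did = did(treated, control)

# Complete case analysis: drop every unit whose post-period outcome is missing.
cc_did = did([r for r in treated if r[2]], [r for r in control if r[2]])

print(f"true effect:       {TAU:.2f}")
print(f"full-data DID:     {full_did:.2f}")
print(f"complete-case DID: {cc_did:.2f}")
```

In this setup the treated group has higher post-period outcomes and so loses more high-change units to missingness, which pulls the complete-case estimate below the true effect even though the full-data DID recovers it.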
Taxonomy
Topics: Optimal Experimental Design Methods
