Multiple imputation in data that grow over time: A comparison of three strategies
X.M. Kavelaars, S. van Buuren, J.R. van Ginkel

TL;DR
This paper compares three strategies for applying multiple imputation to longitudinal data that grow over time, assessing their validity and bias through simulation and an empirical example.
Contribution
It provides a systematic comparison of re-imputation, appended, and nested imputation strategies for longitudinal datasets with missing data.
Findings
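All three strategies yield valid inference under a monotone missingness pattern.
Under a non-monotone missingness pattern, nested and appended imputation can produce biased regression coefficients, depending on the correlation structure of the data.
Appended imputation performs well in datasets with dropout.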
Abstract
Multiple imputation is a highly recommended technique to deal with missing data, but the application to longitudinal datasets can be done in multiple ways. When a new wave of longitudinal data arrives, we can treat the combined data of multiple waves as a new missing data problem and overwrite existing imputations with new values (re-imputation). Alternatively, we may keep the existing imputations, and impute only the new data. We may do either a full multiple imputation (nested) or a single imputation (appended) on the new data per imputed set. This study compares these three strategies by means of simulation. All techniques resulted in valid inference under a monotone missingness pattern. A non-monotone missingness pattern led to biased and non-confidence valid regression coefficients after nested and appended imputation, depending on the correlation structure of the data.…
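To make the difference between the three strategies concrete, the following is a minimal sketch in Python, not the authors' implementation. It assumes a toy dataset with a fully observed covariate x, a first-wave outcome y1, and a newly arrived second-wave outcome y2, with a monotone missingness pattern. The helper names (impute_column, reimpute, append_wave, nest_wave) are hypothetical, and the imputation model is a simplified stochastic regression that does not draw the regression parameters from their posterior, so it is not a proper multiple imputation procedure.

```python
import numpy as np

def impute_column(data, target, predictors, rng):
    """Fill NaNs in column `target` by stochastic linear regression.

    Simplification (assumption): parameters are estimated once by OLS and
    only residual noise is drawn, so parameter uncertainty is ignored.
    """
    out = data.copy()
    miss = np.isnan(out[:, target])
    X = np.column_stack([np.ones(len(out)), out[:, predictors]])
    beta, *_ = np.linalg.lstsq(X[~miss], out[~miss, target], rcond=None)
    resid = out[~miss, target] - X[~miss] @ beta
    sigma = resid.std(ddof=X.shape[1])
    out[miss, target] = X[miss] @ beta + rng.normal(0.0, sigma, miss.sum())
    return out

def reimpute(full, m, rng):
    """Re-imputation: discard old imputations, impute all waves again m times."""
    return [impute_column(impute_column(full, 1, [0], rng), 2, [0, 1], rng)
            for _ in range(m)]

def append_wave(old_sets, new_wave, rng):
    """Appended: keep each old completed set, impute the new wave once per set."""
    return [impute_column(np.column_stack([d, new_wave]), 2, [0, 1], rng)
            for d in old_sets]

def nest_wave(old_sets, new_wave, n, rng):
    """Nested: keep each old completed set, impute the new wave n times per set."""
    return [[impute_column(np.column_stack([d, new_wave]), 2, [0, 1], rng)
             for _ in range(n)]
            for d in old_sets]

# Toy data (synthetic, for illustration only).
rng = np.random.default_rng(0)
size = 200
x = rng.normal(size=size)
y1 = 0.5 * x + rng.normal(size=size)
y2 = 0.5 * y1 + rng.normal(size=size)

# Monotone pattern: anyone missing at wave 1 is also missing at wave 2.
miss1 = rng.random(size) < 0.2
miss2 = miss1 | (rng.random(size) < 0.2)
y1_obs = np.where(miss1, np.nan, y1)
y2_obs = np.where(miss2, np.nan, y2)

# Wave 1 was imputed earlier, giving m = 5 completed datasets.
wave1 = np.column_stack([x, y1_obs])
old_sets = [impute_column(wave1, 1, [0], rng) for _ in range(5)]

# Wave 2 arrives: three ways to proceed.
re_sets = reimpute(np.column_stack([x, y1_obs, y2_obs]), 5, rng)
app_sets = append_wave(old_sets, y2_obs, rng)
nest_sets = nest_wave(old_sets, y2_obs, 3, rng)
```

In this sketch, re-imputation and appended imputation both return 5 completed datasets, while nested imputation returns 5 groups of 3. Only the appended and nested results retain the earlier wave-1 imputations unchanged, which is what makes previously published analyses on those imputations reproducible.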
Taxonomy
Topics: Statistical Methods and Bayesian Inference · Statistical Methods and Inference · demographic modeling and climate adaptation
