Reference based multiple imputation -- what is the right variance and how to estimate it
Jonathan W. Bartlett

TL;DR
This paper examines the appropriate variance estimation methods for reference-based multiple imputation in clinical trials, advocating for the use of the repeated sampling variance and reviewing various estimation approaches.
Contribution
It reviews the arguments on both sides of the variance estimation debate for reference-based multiple imputation and recommends a recent proposal for combining bootstrapping with multiple imputation as a widely applicable solution.
Findings
Rubin's variance estimator is biased upwards relative to the true repeated sampling variance of the reference-based imputation estimator
Repeated sampling variance is more appropriate for inference
Combining bootstrapping with multiple imputation offers a general variance estimation approach
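For readers unfamiliar with the estimator under discussion, the following is a minimal sketch of the standard Rubin's rules pooling step, which produces the variance estimate that the findings above describe as biased under reference-based imputation. It is not code from the paper: the function name `rubin_pool` and the example numbers are hypothetical, and the sketch assumes each imputed dataset has already been analysed to give a point estimate and a within-imputation variance.

```python
from statistics import mean, variance

def rubin_pool(estimates, variances):
    """Pool per-imputation results with Rubin's rules (illustrative sketch).

    estimates: point estimates, one per imputed dataset
    variances: within-imputation variance estimates, one per imputed dataset
    Returns (pooled estimate, within variance W, between variance B, total T).
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need at least two imputations and matching lengths")
    q_bar = mean(estimates)      # pooled point estimate
    w = mean(variances)          # average within-imputation variance
    b = variance(estimates)      # between-imputation variance (divisor m - 1)
    t = w + (1 + 1 / m) * b      # Rubin's total variance
    return q_bar, w, b, t

# Hypothetical numbers, purely for illustration.
q_bar, w, b, t = rubin_pool([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(q_bar, w, b, round(t, 4))
```

The total `t` is the quantity whose suitability the paper questions. A bootstrap-based alternative would instead resample the observed data and re-run imputation and analysis within each resample, but the details of the specific proposal the paper recommends are not given in this summary.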
Abstract
Reference based multiple imputation methods have become popular for handling missing data in randomised clinical trials. Rubin's variance estimator is well known to be biased compared to the reference based imputation estimator's true repeated sampling variance. Somewhat surprisingly given the increasing popularity of these methods, there has been relatively little debate in the literature as to whether Rubin's variance estimator or alternative (smaller) variance estimators targeting the repeated sampling variance are more appropriate. We review the arguments made on both sides of this debate, and conclude that the repeated sampling variance is more appropriate. We review different approaches for estimating the frequentist variance, and suggest a recent proposal for combining bootstrapping with multiple imputation as a widely applicable general solution. At the same time, in light of…
