How many imputations do you need? A two-stage calculation using a quadratic rule
Paul T. von Hippel

TL;DR
This paper introduces a two-stage method for choosing the number of imputations in multiple imputation so that standard error estimates are replicable, which matters most when the fraction of missing information is high.
Contribution
It proposes a quadratic rule-based two-stage calculation for the number of imputations needed, along with new software tools for implementation.
Findings
The number of imputations needed for replicable SE estimates increases quadratically with the fraction of missing information.
Two-stage procedure improves replicability of standard error estimates.
Introduces new commands in Stata and SAS for practical application.
Abstract
When using multiple imputation, users often want to know how many imputations they need. An old answer is that 2 to 10 imputations usually suffice, but this recommendation only addresses the efficiency of point estimates. You may need more imputations if, in addition to efficient point estimates, you also want standard error (SE) estimates that would not change (much) if you imputed the data again. For replicable SE estimates, the required number of imputations increases quadratically with the fraction of missing information (not linearly, as previous studies have suggested). I recommend a two-stage procedure in which you conduct a pilot analysis using a small-to-moderate number of imputations, then use the results to calculate the number of imputations that are needed for a final analysis whose SE estimates will have the desired level of replicability. I implement the two-stage…
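The two-stage calculation described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the paper's SAS or Stata software. The abstract above does not state the formula, so the sketch assumes the quadratic rule takes the form M = 1 + 0.5 * (FMI / CV)^2, where FMI is the fraction of missing information and CV is the target coefficient of variation of the SE estimate. It also assumes that the pilot FMI estimate is made conservative by taking the upper limit of a 95% interval on the logit scale, with standard error sqrt(2 / M_pilot). All function names and defaults are hypothetical.

```python
import math


def imputations_needed(fmi, cv=0.05):
    """Assumed quadratic rule: M = 1 + 0.5 * (fmi / cv)**2, rounded up.

    fmi: fraction of missing information, in [0, 1).
    cv:  target coefficient of variation of the SE estimate (assumed default).
    """
    if not 0.0 <= fmi < 1.0:
        raise ValueError("fmi must be in [0, 1)")
    if cv <= 0.0:
        raise ValueError("cv must be positive")
    raw = 1.0 + 0.5 * (fmi / cv) ** 2
    # Round away floating-point noise before taking the ceiling.
    return math.ceil(round(raw, 9))


def fmi_upper_bound(fmi_hat, m_pilot, z=1.96):
    """Upper limit of an interval for the FMI, built on the logit scale.

    Assumption: logit(fmi_hat) is roughly normal with SE sqrt(2 / m_pilot).
    """
    if not 0.0 < fmi_hat < 1.0:
        raise ValueError("fmi_hat must be strictly between 0 and 1")
    if m_pilot < 2:
        raise ValueError("the pilot needs at least 2 imputations")
    logit = math.log(fmi_hat / (1.0 - fmi_hat))
    upper_logit = logit + z * math.sqrt(2.0 / m_pilot)
    return 1.0 / (1.0 + math.exp(-upper_logit))


def final_imputations(fmi_hat, m_pilot, cv=0.05):
    """Stage 2: imputations for the final analysis, given pilot results.

    Uses the conservative (upper-bound) FMI and never returns fewer
    imputations than the pilot already used.
    """
    conservative_fmi = fmi_upper_bound(fmi_hat, m_pilot)
    return max(m_pilot, imputations_needed(conservative_fmi, cv))


if __name__ == "__main__":
    # Hypothetical pilot: 20 imputations, estimated FMI of 0.30.
    print("point-estimate rule:", imputations_needed(0.30))
    print("two-stage result:   ", final_imputations(0.30, 20))
```

Under the assumed rule, halving the target CV roughly quadruples the required number of imputations, as does doubling the FMI. Using the interval's upper limit rather than the pilot point estimate guards against a pilot that happens to underestimate the FMI.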
Taxonomy
Topics: Probability and Statistical Research · SAS software applications and methods
