Imputation and Missing Indicators for handling missing data in the development and implementation of clinical prediction models: a simulation study
Rose Sisk, Matthew Sperrin, Niels Peek, Maarten van Smeden, Glen P. Martin

TL;DR
This simulation study compares imputation and missing indicator methods for handling missing data in clinical prediction models. It finds that traditional multiple imputation (MI) principles do not always apply in the prediction setting and that simpler methods can sometimes perform better.
Contribution
The paper provides empirical evidence on the performance of various missing data handling methods in prediction models, highlighting when and how to use imputation or missing indicators.
Findings
When imputation is applied at deployment, omitting the outcome from the imputation model at development is preferred.
Missing indicators can improve performance but may be harmful if missingness depends on the outcome.
Traditional MI guidelines may not be suitable for prediction models with missing data.
Abstract
Background: Existing guidelines for handling missing data are generally not consistent with the goals of prediction modelling, where missing data can occur at any stage of the model pipeline. Multiple imputation (MI), often heralded as the gold standard approach, can be challenging to apply in the clinic. Clearly, the outcome cannot be used to impute data at prediction time. Regression imputation (RI) may offer a pragmatic alternative in the prediction context that is simpler to apply in the clinic. Moreover, the use of missing indicators can handle informative missingness, but it is currently unknown how well they perform within clinical prediction models (CPMs). Methods: We performed a simulation study where data were generated under various missing data mechanisms to compare the predictive performance of CPMs developed using both imputation methods. We consider deployment scenarios where missing data is…
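The two simpler techniques named in the abstract, regression imputation with the outcome omitted and missing indicators, can be illustrated with a minimal sketch. This is not the authors' simulation code: the function names, the single fully observed predictor `x1`, and the linear imputation model are all assumptions made for illustration. The imputation model is fitted once on development data and then reused unchanged at deployment, where the outcome is unavailable.

```python
import numpy as np

def fit_regression_imputer(x_obs, x_target):
    """Fit x_target ~ intercept + x_obs by least squares on complete cases.

    The outcome is deliberately left out of the imputation model
    (hypothetical helper, for illustration only).
    """
    complete = ~np.isnan(x_target)
    design = np.column_stack([np.ones(complete.sum()), x_obs[complete]])
    coef, *_ = np.linalg.lstsq(design, x_target[complete], rcond=None)
    return coef  # [intercept, slope]

def impute_with_indicator(x_obs, x_target, coef):
    """Fill missing x_target with its predicted value and return a 0/1 missing indicator."""
    missing = np.isnan(x_target)
    predicted = coef[0] + coef[1] * x_obs
    imputed = np.where(missing, predicted, x_target)
    return imputed, missing.astype(int)

# Development data: x2 depends on x1 and is missing completely at random (assumed setup).
rng = np.random.default_rng(0)
x1_dev = rng.normal(size=200)
x2_dev = 0.5 * x1_dev + rng.normal(scale=0.5, size=200)
x2_dev[rng.random(200) < 0.3] = np.nan

coef = fit_regression_imputer(x1_dev, x2_dev)
x2_dev_imputed, r_dev = impute_with_indicator(x1_dev, x2_dev, coef)

# Deployment: a new patient with x2 missing is imputed with the stored coefficients.
x2_new, r_new = impute_with_indicator(np.array([1.2]), np.array([np.nan]), coef)
```

The imputed predictor and the indicator would then both enter the prediction model as covariates. As the findings above note, including the indicator can be harmful when missingness depends on the outcome.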
Taxonomy
Topics: Meta-analysis and systematic reviews · Demographic modeling and climate adaptation · Statistical Methods and Bayesian Inference
