Imputing Missing Values with External Data
Robert Thiesmeier, Matteo Bottai, Nicola Orsini

TL;DR
This paper introduces a method and a Stata command, mi impute from, for imputing missing values using estimates from external studies, so that studies can collaborate without sharing individual-level data.
Contribution
The paper presents an imputation approach that uses linear predictors and their covariance matrix from imputation models estimated in one or more external studies, implemented in the Stata command mi impute from.
Findings
Enables imputation without sharing individual data
Uses linear predictors and covariance matrices from external models
Facilitates collaborative research with missing data
Abstract
Missing data is a common challenge across scientific disciplines. Current imputation methods require the availability of individual data to impute missing values. Often, however, missingness requires using external data for the imputation. In this paper, we introduce a new Stata command, mi impute from, designed to impute missing values using linear predictors and their related covariance matrix from imputation models estimated in one or multiple external studies. This allows for the imputation of any missing values without sharing individual data between studies. We describe the underlying method and present the syntax of mi impute from alongside practical examples of missing data in collaborative research projects.
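The idea described in the abstract can be illustrated with a minimal sketch. It is written in Python rather than Stata and is not the implementation of mi impute from. It assumes a continuous variable with a linear imputation model, a known residual standard deviation, and made-up external estimates; the function names and all numbers are hypothetical. Each imputation draws a coefficient vector from a normal distribution centred on the external estimates with the external covariance matrix, then fills each missing value with the linear predictor plus residual noise.

```python
import math
import random


def cholesky(matrix):
    """Return the lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def draw_coefficients(beta, cov, rng):
    """Draw one coefficient vector from N(beta, cov)."""
    lower = cholesky(cov)
    z = [rng.gauss(0.0, 1.0) for _ in beta]
    return [b + sum(lower[i][k] * z[k] for k in range(i + 1)) for i, b in enumerate(beta)]


def impute_from_external(rows, beta, cov, resid_sd, m=5, seed=1):
    """Create m completed copies of rows, filling y where it is None.

    rows: list of dicts with keys "x" (list of covariates) and "y" (float or None).
    beta, cov: intercept-first coefficients and their covariance from the external model.
    resid_sd: residual standard deviation, treated as known (a simplifying assumption).
    """
    rng = random.Random(seed)
    completed = []
    for _ in range(m):
        # A fresh draw per imputation reflects uncertainty in the external estimates.
        b = draw_coefficients(beta, cov, rng)
        copy = []
        for row in rows:
            y = row["y"]
            if y is None:
                linear_predictor = b[0] + sum(bj * xj for bj, xj in zip(b[1:], row["x"]))
                y = linear_predictor + rng.gauss(0.0, resid_sd)
            copy.append({"x": row["x"], "y": y})
        completed.append(copy)
    return completed


# Hypothetical external estimates: only these summaries cross study boundaries.
external_beta = [1.0, 2.0]                    # intercept, slope
external_cov = [[0.04, 0.01], [0.01, 0.09]]   # covariance matrix of the estimates
local_rows = [
    {"x": [0.5], "y": 2.1},
    {"x": [1.5], "y": None},
    {"x": [-0.3], "y": None},
]

imputations = impute_from_external(local_rows, external_beta, external_cov, resid_sd=0.5, m=3)
for k, data in enumerate(imputations, start=1):
    print(k, [round(r["y"], 2) for r in data])
```

Only the coefficient vector and its covariance matrix are passed to the study with missing values, which is what makes imputation possible without exchanging individual records. Observed values are left unchanged, and the completed datasets would then be analysed and combined with the usual multiple-imputation rules.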
Taxonomy
Topics: demographic modeling and climate adaptation
