Population-calibrated multiple imputation for a binary/categorical covariate in categorical regression models
Tra My Pham, James R Carpenter, Tim P Morris, Angela M Wood, Irene Petersen

TL;DR
This paper introduces a population-calibrated multiple imputation method that incorporates external population data to improve inference accuracy for missing categorical data, especially under MNAR mechanisms.
Contribution
The paper proposes a calibrated-$\delta$ adjustment for MI that leverages external population distributions to enhance inference under MNAR, extending standard MI methods.
Findings
Method matches standard MI under MAR
Provides more accurate inference under MNAR
Applied to ethnicity data in diabetes study
Abstract
Multiple imputation (MI) has become popular for analyses with missing data in medical research. The standard implementation of MI is based on the assumption of data being missing at random (MAR). However, for missing data generated by missing not at random (MNAR) mechanisms, MI performed assuming MAR might not be satisfactory. For an incomplete variable in a given dataset, its corresponding population marginal distribution might also be available in an external data source. We show how this information can be readily utilised in the imputation model to calibrate inference to the population, by incorporating an appropriately calculated offset termed the 'calibrated-$\delta$ adjustment'. We describe the derivation of this offset from the population distribution of the incomplete variable and show how in applications it can be used to closely (and often exactly) match the post-imputation…
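To make the idea of a calibrating offset concrete, the following is a minimal sketch for a binary incomplete covariate. It is not the authors' derivation. It assumes a logistic imputation model has already been fitted, and that `lp_missing` holds its linear predictors for the cases with the covariate missing. The function name `calibrated_delta`, its parameters, and the use of bisection are all hypothetical choices for illustration. The sketch solves for an intercept shift $\delta$ such that the expected proportion of ones after imputation equals a known population proportion `p_pop`.

```python
import numpy as np


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + np.exp(-z))


def calibrated_delta(lp_missing, n_obs_ones, n_obs, p_pop,
                     lo=-20.0, hi=20.0, tol=1e-10):
    """Illustrative (hypothetical) solver for an intercept offset delta.

    lp_missing : linear predictors from a fitted logistic imputation
                 model, one per case with the covariate missing
    n_obs_ones : number of observed cases with covariate equal to 1
    n_obs      : number of cases with the covariate observed
    p_pop      : population proportion with covariate equal to 1
    """
    lp_missing = np.asarray(lp_missing, dtype=float)
    n_mis = lp_missing.size
    n_total = n_obs + n_mis

    # Proportion of ones required among the missing cases so that the
    # completed data reproduce the population proportion on average.
    target = (p_pop * n_total - n_obs_ones) / n_mis
    if not 0.0 < target < 1.0:
        raise ValueError("population proportion is not attainable")

    # The mean imputation probability is increasing in delta,
    # so bisection on [lo, hi] finds the root.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if expit(lp_missing + mid).mean() < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Toy check: 60 observed cases (30 ones), 40 missing cases whose
# linear predictors are all zero, population proportion 0.6.
delta = calibrated_delta(np.zeros(40), n_obs_ones=30, n_obs=60, p_pop=0.6)
```

In this toy case the missing cases need a proportion of $(0.6 \times 100 - 30)/40 = 0.75$ ones, so the offset is $\operatorname{logit}(0.75) = \log 3$. When `p_pop` equals the proportion already implied by the unshifted imputation model, the offset is zero, which is consistent with the summary's statement that the method matches standard MI under MAR.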
