Missing value imputation with adversarial random forests -- MissARF
Pegah Golchian, Jan Kapar, David S. Watson, Marvin N. Wright

TL;DR
MissARF is a novel, fast, and easy-to-use adversarial random forest-based imputation method that effectively handles missing data, providing both single and multiple imputation with competitive quality and runtime.
Contribution
Introduces MissARF, a new imputation method built on adversarial random forests (ARF), a generative machine learning approach, for efficient and accurate missing value imputation in biostatistics.
Findings
Performs comparably to state-of-the-art methods in imputation quality.
Offers fast runtime with no extra cost for multiple imputation.
Provides both single and multiple imputation options.
Abstract
Handling missing values is a common challenge in biostatistical analyses, typically addressed by imputation methods. We propose a novel, fast, and easy-to-use imputation method called missing value imputation with adversarial random forests (MissARF), based on generative machine learning, that provides both single and multiple imputation. MissARF employs adversarial random forest (ARF) for density estimation and data synthesis. To impute a missing value of an observation, we condition on the non-missing values and sample from the estimated conditional distribution generated by ARF. Our experiments demonstrate that MissARF performs comparably to state-of-the-art single and multiple imputation methods in imputation quality, with fast runtime and no additional cost for multiple imputation.
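The conditioning step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the fitted forest is replaced by a hand-written list of leaves (`LEAVES`), each assumed to hold a coverage weight and independent Gaussian marginals per feature, and the feature names `age` and `bmi`, as well as the helper functions, are hypothetical. The sketch only shows the idea of reweighting leaves by the likelihood of the observed values and then sampling the missing value from the selected leaf.

```python
import math
import random

# Hypothetical stand-in for a fitted ARF density: each leaf has a coverage
# weight and independent per-feature Gaussians (mean, std). Leaf split bounds
# are omitted here for brevity (an assumption of this sketch).
LEAVES = [
    {"weight": 0.6, "params": {"age": (30.0, 5.0), "bmi": (22.0, 2.0)}},
    {"weight": 0.4, "params": {"age": (60.0, 8.0), "bmi": (28.0, 3.0)}},
]


def normal_pdf(x, mean, std):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def leaf_posterior(row, leaves):
    """Reweight leaves by the likelihood of the row's observed values."""
    scores = []
    for leaf in leaves:
        score = leaf["weight"]
        for name, value in row.items():
            if value is not None:  # condition only on non-missing features
                mean, std = leaf["params"][name]
                score *= normal_pdf(value, mean, std)
        scores.append(score)
    total = sum(scores)
    return [s / total for s in scores]


def impute_once(row, leaves, rng):
    """Draw one completed copy of the row from the conditional distribution."""
    post = leaf_posterior(row, leaves)
    leaf = rng.choices(leaves, weights=post, k=1)[0]
    completed = dict(row)
    for name, value in row.items():
        if value is None:  # sample each missing feature within the chosen leaf
            mean, std = leaf["params"][name]
            completed[name] = rng.gauss(mean, std)
    return completed


def impute_multiple(row, leaves, m, seed=0):
    """Multiple imputation: m independent draws from the same fitted density."""
    rng = random.Random(seed)
    return [impute_once(row, leaves, rng) for _ in range(m)]


row = {"age": 58.0, "bmi": None}
imputations = impute_multiple(row, LEAVES, m=5)
```

In this sketch, each additional imputation is only one more draw from the already-fitted density, which is consistent with the abstract's claim of no additional cost for multiple imputation.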
