Generative Modeling under Non-Monotone MAR Missingness via Approximate Wasserstein Gradient Flows
Gitte Kremling, Jeffrey Näf, Johannes Lederer

TL;DR
This paper introduces FLOWGEM, a nonparametric method that generates a complete dataset from data with values Missing at Random (MAR) by using Wasserstein gradient flows to minimize a KL divergence.
Contribution
FLOWGEM is a new iterative data generation approach that leverages Wasserstein gradient flows and density ratio estimation to handle non-monotone MAR missingness.
Findings
FLOWGEM is reported to achieve state-of-the-art performance across a range of missing-data scenarios.
The method effectively handles complex non-monotone MAR mechanisms.
Simulation and real-data results validate its practical advantages.
Abstract
The prevalence of missing values in data science poses a substantial risk to any further analyses. Despite a wealth of research, principled nonparametric methods to deal with general non-monotone missingness are still scarce. Instead, ad-hoc imputation methods are often used, for which it remains unclear whether the correct distribution can be recovered. In this paper, we propose FLOWGEM, a principled iterative method for generating a complete dataset from a dataset with values Missing at Random (MAR). Motivated by convergence results of the ignoring maximum likelihood estimator, our approach minimizes the expected Kullback-Leibler (KL) divergence between the observed data distribution and the distribution of the generated sample over different missingness patterns. To minimize the KL divergence, we employ a discretized particle evolution of the corresponding Wasserstein Gradient Flow,…
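To make the "discretized particle evolution" idea concrete, here is a minimal one-dimensional sketch of a KL Wasserstein gradient flow driven by an estimated density ratio. It is not the FLOWGEM algorithm: it uses fully observed data and ignores missingness patterns. The function names `fit_log_ratio_grad` and `run_flow`, the quadratic logistic classifier used as the density ratio estimator, the Gaussian toy data, and all step sizes are assumptions made for illustration. The only fact relied on is the standard one that the Wasserstein gradient flow of KL(q || p) moves particles along the velocity -∇ log(q/p).

```python
import numpy as np

def fit_log_ratio_grad(particles, target, n_iter=200, lr=1.0):
    """Estimate d/dx log(q(x)/p(x)) with a logistic classifier (illustrative choice).

    Particles (samples of q) get label 1, target samples (from p) get label 0.
    The classifier logit then approximates log q/p up to a constant, which
    does not affect the gradient with respect to x.
    """
    x = np.concatenate([particles, target])
    y = np.concatenate([np.ones(len(particles)), np.zeros(len(target))])
    feats = np.stack([x, x**2], axis=1)           # quadratic features
    mu = feats.mean(axis=0)
    sd = feats.std(axis=0) + 1e-12
    Z = np.column_stack([np.ones(len(x)), (feats - mu) / sd])
    w = np.zeros(3)
    for _ in range(n_iter):                       # plain gradient descent on log-loss
        p = 1.0 / (1.0 + np.exp(-np.clip(Z @ w, -30.0, 30.0)))
        w -= lr * Z.T @ (p - y) / len(x)
    a = w[1] / sd[0]                              # coefficient of x in the logit
    b = w[2] / sd[1]                              # coefficient of x**2 in the logit
    return lambda pts: a + 2.0 * b * pts          # derivative of the logit in x

def run_flow(particles, target, n_steps=200, step=0.1):
    """Explicit Euler discretization: x <- x - step * d/dx log(q/p)(x)."""
    x = particles.copy()
    for _ in range(n_steps):
        grad = fit_log_ratio_grad(x, target)      # re-estimate the ratio each step
        x = x - step * grad(x)
    return x

rng = np.random.default_rng(0)
target = rng.normal(0.0, 1.0, size=500)           # stand-in for observed data
init = rng.normal(3.0, 1.0, size=500)             # initial generated sample
final = run_flow(init, target)
print("mean:", final.mean(), "std:", final.std())
```

In this toy setting the particles should drift from the initial distribution toward the target, since the update has a fixed point where the first two moments of the particles match those of the target sample. A richer density ratio estimator would be needed for non-Gaussian or multivariate data.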
