Generative Modeling under Non-Monotone MAR Missingness via Approximate Wasserstein Gradient Flows

Gitte Kremling; Jeffrey Näf; Johannes Lederer

arXiv:2604.04567 · stat.ML · May 4, 2026

TL;DR

This paper introduces FLOWGEM, a nonparametric method for generating a complete dataset from data with values Missing at Random (MAR). It uses a Wasserstein gradient flow to minimize a Kullback-Leibler (KL) divergence.

Contribution

FLOWGEM is a new iterative data generation approach that leverages Wasserstein gradient flows and density ratio estimation to handle non-monotone MAR missingness.
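
To make "non-monotone MAR" concrete, here is a minimal sketch of one such mechanism on two variables. It is an illustration only and is not taken from the paper: the function name `sample_nonmonotone_mar_mask`, the sigmoid form, and the factor 0.3 are all assumptions. The mechanism is MAR because the probability of each pattern depends only on the values observed under that pattern, and it is non-monotone because neither "column 0 missing" nor "column 1 missing" is nested in the other.

```python
import numpy as np

def sample_nonmonotone_mar_mask(x, rng):
    """Return a boolean mask (True = missing) for a two-column array x.

    Hypothetical mechanism with three patterns:
      A: column 0 missing, with probability depending only on column 1
      B: column 1 missing, with probability depending only on column 0
      C: fully observed, taking the remaining probability
    """
    def sigmoid(t):
        return 1.0 / (1.0 + np.exp(-t))

    p_a = 0.3 * sigmoid(x[:, 1])  # uses only the value observed under pattern A
    p_b = 0.3 * sigmoid(x[:, 0])  # uses only the value observed under pattern B
    u = rng.uniform(size=len(x))  # one draw per row selects exactly one pattern

    mask = np.zeros(x.shape, dtype=bool)
    mask[u < p_a, 0] = True
    mask[(u >= p_a) & (u < p_a + p_b), 1] = True
    return mask

rng = np.random.default_rng(0)
x = rng.normal(size=(1000, 2))
mask = sample_nonmonotone_mar_mask(x, rng)
x_obs = np.where(mask, np.nan, x)  # incomplete dataset with NaN for missing entries
```

Because `p_a + p_b` never exceeds 0.6, the fully observed pattern always keeps positive probability, and no row loses both columns.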

Findings

01 FLOWGEM achieves state-of-the-art performance in various missing data scenarios.

02 The method effectively handles complex non-monotone MAR mechanisms.

03 Simulation and real-data results validate its practical advantages.

Abstract

The prevalence of missing values in data science poses a substantial risk to any further analyses. Despite a wealth of research, principled nonparametric methods to deal with general non-monotone missingness are still scarce. Instead, ad-hoc imputation methods are often used, for which it remains unclear whether the correct distribution can be recovered. In this paper, we propose FLOWGEM, a principled iterative method for generating a complete dataset from a dataset with values Missing at Random (MAR). Motivated by convergence results of the ignoring maximum likelihood estimator, our approach minimizes the expected Kullback-Leibler (KL) divergence between the observed data distribution and the distribution of the generated sample over different missingness patterns. To minimize the KL divergence, we employ a discretized particle evolution of the corresponding Wasserstein Gradient Flow,…
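
The core idea named in the abstract, a discretized particle evolution of a Wasserstein gradient flow of the KL divergence, can be sketched on a fully observed one-dimensional toy problem. This is not the authors' implementation and it ignores missingness patterns entirely. It assumes the divergence is KL(ρ‖π) with ρ the particle distribution and π the target, in which case particles move along ∇ log(π/ρ). The density ratio is estimated here with a logistic regression on quadratic features, which is a stand-in chosen for brevity; the names `fit_log_ratio` and `flow` and all hyperparameters are hypothetical.

```python
import numpy as np

def fit_log_ratio(target, particles, iters=25, ridge=1e-3):
    """Fit a logistic regression (target = 1, particles = 0) on (1, x, x^2).

    With equal sample sizes, the fitted logit estimates
    log(p_target(x) / p_particles(x)). Solved by Newton's method.
    """
    x = np.concatenate([target, particles])
    y = np.concatenate([np.ones(len(target)), np.zeros(len(particles))])
    feats = np.stack([np.ones_like(x), x, x**2], axis=1)
    w = np.zeros(3)
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-feats @ w))
        grad = feats.T @ (p - y) / len(y) + ridge * w
        hess = (feats * (p * (1 - p))[:, None]).T @ feats / len(y)
        hess += ridge * np.eye(3)
        w -= np.linalg.solve(hess, grad)
    return w

def flow(particles, target, n_steps=30, step_size=0.2):
    """Explicit Euler discretization: x <- x + step_size * d/dx log-ratio(x)."""
    for _ in range(n_steps):
        w = fit_log_ratio(target, particles)
        velocity = w[1] + 2.0 * w[2] * particles  # derivative of the fitted logit
        particles = particles + step_size * velocity
    return particles

rng = np.random.default_rng(0)
target = rng.normal(loc=2.0, scale=1.0, size=500)  # samples from the target
start = rng.normal(loc=0.0, scale=1.0, size=500)   # initial particles
moved = flow(start, target)
```

The ratio is re-estimated at every step because the particle distribution changes as the particles move. Quadratic features suffice here only because both distributions are Gaussian; a general problem would need a more flexible ratio estimator.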
