Rethinking the Diffusion Models for Numerical Tabular Data Imputation from the Perspective of Wasserstein Gradient Flow
Zhichao Chen, Haoxuan Li, Fangyikang Wang, Odin Zhang, Hu Xu, Xiaoyu Jiang, Zhihuan Song, Eric H. Wang

TL;DR
This paper introduces KnewImp, a novel Wasserstein gradient flow-based method for numerical tabular data imputation. It addresses the inaccurate imputation and difficult training of diffusion models and outperforms existing methods.
Contribution
The paper proposes KnewImp, a new principled approach using Wasserstein gradient flow and kernel methods to improve imputation accuracy and simplify training in tabular data.
Findings
KnewImp significantly outperforms state-of-the-art imputation methods.
Theoretical analysis links diffusion model issues to cost functional properties.
Eliminates the need for a mask matrix in training, simplifying the process.
Abstract
Diffusion models (DMs) have gained attention in Missing Data Imputation (MDI), but two long-neglected issues remain to be addressed: (1) Inaccurate Imputation, which arises from the inherently sample-diversification-pursuing generative process of DMs, and (2) Difficult Training, which stems from the intricate design required for the mask matrix in the model training stage. To address these concerns within the realm of numerical tabular datasets, we introduce a novel principled approach termed Kernelized Negative Entropy-regularized Wasserstein gradient flow Imputation (KnewImp). Specifically, based on the Wasserstein gradient flow (WGF) framework, we first prove that issue (1) arises because the cost functionals implicitly maximized in DM-based MDI are equivalent to the MDI's objective plus diversification-promoting non-negative terms. Based on this, we then design a novel cost functional with…
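To make the term "mask matrix" concrete, here is a minimal sketch of how such a matrix is commonly built for a numerical table. It assumes missing entries are encoded as NaN and that the mask is 1 for observed cells and 0 for missing ones. The helper names `make_mask` and `mean_impute` are hypothetical, and the column-mean fill is only a simple baseline for illustration. It is not KnewImp and not the diffusion-based procedure the abstract discusses.

```python
import numpy as np

def make_mask(x):
    """Return a 0/1 mask: 1 where a value is observed, 0 where it is missing (NaN)."""
    return (~np.isnan(x)).astype(int)

def mean_impute(x):
    """Baseline only: fill each missing entry with the observed mean of its column."""
    mask = make_mask(x)
    col_means = np.nanmean(x, axis=0)  # per-column mean, ignoring NaN
    # Keep observed values; broadcast the column means into the missing cells.
    return np.where(mask == 1, x, col_means)

# Toy table with two missing cells (assumed NaN encoding).
table = np.array([[1.0, np.nan],
                  [3.0, 4.0],
                  [np.nan, 8.0]])
mask = make_mask(table)
imputed = mean_impute(table)
```

In this sketch the mask only marks which cells are observed. The abstract's point is that DM-based methods need extra, intricate mask design during training, which this baseline does not attempt to reproduce.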
Taxonomy
Topics: Fluid Dynamics and Turbulent Flows · Lattice Boltzmann Simulation Studies
Methods: Softmax · Attention Is All You Need
