TL;DR
This paper presents SimpDM, a self-supervised diffusion model designed for tabular data imputation, which improves stability and robustness over existing methods through alignment and data augmentation strategies.
Contribution
The paper introduces SimpDM, a novel self-supervised diffusion model with alignment and data augmentation techniques tailored for robust tabular data imputation.
Findings
SimpDM outperforms state-of-the-art imputation methods in various scenarios.
Self-supervised alignment improves stability of diffusion-based imputation.
State-dependent data augmentation enhances robustness with limited data.
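The stability finding can be illustrated with a toy sketch. It is not SimpDM's actual formulation: the abstract here is truncated before the mechanism is described. The sketch assumes that "alignment" means penalizing disagreement between two predictions for the same row under independent noise draws. The names `forward_noise`, `toy_denoiser`, and `alignment_loss` are hypothetical, and the denoiser is a stand-in for a trained network.

```python
import numpy as np

def forward_noise(x0, alpha_bar, rng):
    # Standard DDPM-style forward step: x_t = sqrt(a) * x0 + sqrt(1 - a) * eps.
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps

def toy_denoiser(x_t, alpha_bar):
    # Hypothetical stand-in for a trained network: simply rescales x_t.
    return x_t / np.sqrt(alpha_bar)

def alignment_loss(x0, mask, alpha_bar, rng):
    # Assumed form of an alignment penalty: mean squared disagreement between
    # two predictions from independent noise draws, over entries with mask == 1.
    pred_a = toy_denoiser(forward_noise(x0, alpha_bar, rng), alpha_bar)
    pred_b = toy_denoiser(forward_noise(x0, alpha_bar, rng), alpha_bar)
    diff = (pred_a - pred_b) ** 2
    return float((diff * mask).sum() / max(mask.sum(), 1.0))

rng = np.random.default_rng(0)
x0 = rng.standard_normal((64, 4))                   # toy table: 64 rows, 4 numeric columns
mask = (rng.random((64, 4)) > 0.3).astype(float)    # 1 = observed entry, 0 = missing

low_noise = alignment_loss(x0, mask, alpha_bar=0.99, rng=rng)
high_noise = alignment_loss(x0, mask, alpha_bar=0.5, rng=rng)
print(f"disagreement at low noise:  {low_noise:.4f}")
print(f"disagreement at high noise: {high_noise:.4f}")
```

In this toy setting the disagreement grows with the noise level, which is the kind of noise sensitivity an alignment term would be meant to reduce during training.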
Abstract
The ubiquity of missing data has drawn considerable attention to tabular data imputation methods. Diffusion models, recognized as the cutting-edge technique for data generation, demonstrate significant potential in tabular data imputation tasks. However, in pursuit of diversity, vanilla diffusion models are often sensitive to the initial noise, which hinders them from generating stable and accurate imputation results. Additionally, the sparsity inherent in tabular data makes it difficult for diffusion models to model the data manifold accurately, which reduces their robustness for data imputation. To tackle these challenges, this paper introduces an advanced diffusion model named Self-supervised imputation Diffusion Model (SimpDM for brevity), specifically tailored for tabular data imputation tasks. To mitigate sensitivity to noise, we introduce a…
Taxonomy
Methods: Softmax · Attention Is All You Need · Focus · Diffusion
