MissHDD: Hybrid Deterministic Diffusion for Heterogeneous Incomplete Data Imputation
Youran Zhou, Mohamed Reda Bouadjenek, Sunil Aryal

TL;DR
MissHDD introduces a hybrid deterministic diffusion approach that effectively imputes incomplete heterogeneous tabular data by separating numerical and categorical features into two specialized generative channels, improving accuracy and stability.
Contribution
The paper presents a novel hybrid deterministic diffusion framework that models mixed-type data with two complementary channels, addressing limitations of existing stochastic diffusion models.
Findings
Achieves higher imputation accuracy across diverse datasets.
Provides more stable and robust sampling trajectories.
Outperforms existing diffusion-based and classical methods in various missing data scenarios.
Abstract
Incomplete data are common in real-world tabular applications, where numerical, categorical, and discrete attributes coexist within a single dataset. This heterogeneous structure presents significant challenges for existing diffusion-based imputation models, which typically assume a homogeneous feature space and rely on stochastic denoising trajectories. Such assumptions make it difficult to maintain conditional consistency, and they often lead to information collapse for categorical variables or instability when numerical variables require deterministic updates. These limitations indicate that a single diffusion process is insufficient for mixed-type tabular imputation. We propose a hybrid deterministic diffusion framework that separates heterogeneous features into two complementary generative channels. A continuous DDIM-based channel provides efficient and stable deterministic…
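To make the idea of a deterministic DDIM-style update concrete, the sketch below shows the standard DDIM step with the noise term switched off (eta = 0), applied to numerical columns only. It is an illustration under stated assumptions, not the authors' implementation: the names `ddim_step`, `impute_numeric`, `eps_model`, and `abar` are hypothetical, the noise predictor is a placeholder callable, and observed entries are simply copied back at the end rather than conditioned on during sampling. The categorical channel from the paper is not covered here.

```python
import numpy as np

def ddim_step(x_t, eps_pred, abar_t, abar_prev):
    """One deterministic DDIM update (eta = 0).

    abar_t and abar_prev are cumulative noise-schedule products
    (alpha-bar) at the current and the next, less noisy, step.
    """
    # Estimate the clean sample implied by the predicted noise.
    x0_pred = (x_t - np.sqrt(1.0 - abar_t) * eps_pred) / np.sqrt(abar_t)
    # Move to the less noisy level without injecting fresh noise.
    return np.sqrt(abar_prev) * x0_pred + np.sqrt(1.0 - abar_prev) * eps_pred

def impute_numeric(x_obs, mask, eps_model, abar, seed=0):
    """Hypothetical imputation loop for numerical features.

    x_obs     : array with observed values (missing entries may hold anything)
    mask      : boolean array, True where a value is observed
    eps_model : assumed callable (x, t) -> predicted noise, same shape as x
    abar      : 1-D array of alpha-bar values, decreasing with t
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(x_obs.shape)  # the only source of randomness
    for t in range(len(abar) - 1, -1, -1):
        abar_prev = abar[t - 1] if t > 0 else 1.0
        x = ddim_step(x, eps_model(x, t), abar[t], abar_prev)
    # Keep observed entries; use generated values only where data is missing.
    return np.where(mask, x_obs, x)
```

Because no noise is added inside the loop, the trajectory is fully determined by the initial draw and the noise predictor, so repeated runs with the same seed return the same imputation.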
Taxonomy
Topics: Tensor decomposition and applications · Generative Adversarial Networks and Image Synthesis · Advanced Neuroimaging Techniques and Applications
