Impugan: Learning Conditional Generative Models for Robust Data Imputation
Zalish Mahmud, Anantaa Kotal, Aritran Piplai

TL;DR
Impugan is a conditional GAN (cGAN) for robust data imputation. It captures nonlinear relationships between observed and missing variables, handles complex, heterogeneous missing-data scenarios, and outperforms traditional imputation methods in benchmark tests.
Contribution
The paper presents Impugan, a cGAN-based model for imputing missing values and integrating heterogeneous datasets. It avoids the linearity and independence assumptions that limit traditional methods.
Findings
Achieves up to 82% lower Earth Mover's Distance compared to baselines.
Reduces mutual-information deviation by 70%.
Effective in multi-source data integration tasks.
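As a rough illustration of the first metric, the sketch below computes a one-dimensional Earth Mover's Distance (EMD) between true and imputed values, plus the relative reduction against a baseline. This summary does not say how the paper computes EMD (per feature, multivariate, or otherwise), so treat this as an assumption-laden toy: the function names `emd_1d` and `relative_reduction` and all sample values are hypothetical and are not results from the paper.

```python
def emd_1d(a, b):
    """Earth Mover's Distance between two equal-size 1-D samples."""
    if len(a) != len(b) or not a:
        raise ValueError("samples must be non-empty and the same size")
    # For equal-size empirical distributions, the optimal transport plan pairs
    # the i-th smallest value of one sample with the i-th smallest of the other.
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)


def relative_reduction(baseline, candidate):
    """Fractional reduction of a lower-is-better score versus a baseline."""
    return (baseline - candidate) / baseline


# Hypothetical toy values, chosen only to make the arithmetic easy to follow.
true_values = [1.0, 2.0, 3.0, 4.0]
mean_imputed = [2.5, 2.5, 2.5, 2.5]   # over-smoothed: every gap gets the mean
model_imputed = [1.0, 2.0, 3.0, 5.0]  # keeps the spread, one value is off

emd_baseline = emd_1d(true_values, mean_imputed)
emd_model = emd_1d(true_values, model_imputed)
reduction = relative_reduction(emd_baseline, emd_model)
```

Mean imputation collapses the distribution to a single point, which is the kind of over-smoothing that a distributional metric such as EMD penalizes.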
Abstract
Incomplete data are common in real-world applications. Sensors fail, records are inconsistent, and datasets collected from different sources often differ in scale, sampling rate, and quality. These differences create missing values that make it difficult to combine data and build reliable models. Standard imputation methods such as regression models, expectation-maximization, and multiple imputation rely on strong assumptions about linearity and independence. These assumptions rarely hold for complex or heterogeneous data, which can lead to biased or over-smoothed estimates. We propose Impugan, a conditional Generative Adversarial Network (cGAN) for imputing missing values and integrating heterogeneous datasets. The model is trained on complete samples to learn how missing variables depend on observed ones. During inference, the generator reconstructs missing entries from available…
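The inference step described in the abstract can be sketched as a masked fill: observed entries are kept as they are, and only the missing entries are taken from a generator conditioned on the observed values. The truncated abstract does not specify Impugan's architecture or its conditioning scheme, so everything below is an assumption. In particular, `toy_generator` is a hypothetical placeholder (observed mean plus noise), not a trained cGAN, and the binary mask convention (1 = observed, 0 = missing) is a common choice rather than something the source confirms.

```python
import random


def toy_generator(observed, mask, noise):
    """Hypothetical stand-in for a trained conditional generator.

    Returns one candidate value per feature: the mean of the observed
    entries plus noise. A real cGAN generator would be a neural network.
    """
    seen = [x for x, m in zip(observed, mask) if m == 1]
    center = sum(seen) / len(seen) if seen else 0.0
    return [center + z for z in noise]


def impute_row(row, mask, generator, rng):
    """Fill missing entries (mask == 0) and keep observed ones (mask == 1)."""
    # Zero out missing slots so the generator only sees observed information.
    observed = [x if m == 1 else 0.0 for x, m in zip(row, mask)]
    noise = [rng.gauss(0.0, 0.1) for _ in row]
    generated = generator(observed, mask, noise)
    # Composite: observed values pass through, gaps come from the generator.
    return [x if m == 1 else g for x, m, g in zip(row, mask, generated)]


rng = random.Random(0)
row = [0.2, None, 0.6, None]   # None marks a missing entry
mask = [1, 0, 1, 0]
completed = impute_row(row, mask, toy_generator, rng)
```

Because the composite step never overwrites observed values, any error in the generator affects only the imputed entries.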
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Gaussian Processes and Bayesian Inference · Generative Adversarial Networks and Image Synthesis
