Imputation of Missing Data with Class Imbalance using Conditional Generative Adversarial Networks
Saqib Ejaz Awan, Mohammed Bennamoun, Ferdous Sohel, Frank M. Sanfilippo, Girish Dwivedi

TL;DR
This paper introduces CGAIN, a class-conditioned GAN-based method for imputing missing data that accounts for class imbalance, outperforming existing imputation techniques on benchmark datasets.
Contribution
The paper presents CGAIN, a novel class-specific imputation approach using Conditional GANs, addressing limitations of existing methods that ignore class characteristics.
Findings
CGAIN outperforms state-of-the-art imputation methods on benchmark datasets.
Class-specific modeling improves imputation accuracy in imbalanced datasets.
The approach effectively captures class characteristics for better data estimation.
Abstract
Missing data is a common problem faced with real-world datasets. Imputation is a widely used technique to estimate the missing data. State-of-the-art imputation approaches, such as Generative Adversarial Imputation Nets (GAIN), model the distribution of observed data to approximate the missing values. Such an approach usually models a single distribution for the entire dataset, which overlooks the class-specific characteristics of the data. Class-specific characteristics are especially useful when there is a class imbalance. We propose a new method for imputing missing data based on its class-specific characteristics by adapting the popular Conditional Generative Adversarial Networks (CGAN). Our Conditional Generative Adversarial Imputation Network (CGAIN) imputes the missing data using class-specific distributions, which can produce the best estimates for the missing values. We tested…
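The abstract's central claim, that a single dataset-wide distribution overlooks class-specific characteristics under class imbalance, can be illustrated with a deliberately simple stand-in. The sketch below is not CGAIN and uses no GAN: it swaps in plain mean imputation, computed once over the whole dataset and once per class, purely to show why conditioning on the class label can matter. The function names (`impute_global`, `impute_by_class`) and the toy data are hypothetical and do not come from the paper.

```python
from statistics import mean


def impute_global(rows):
    """Fill None entries with the column mean over ALL rows (one distribution)."""
    n_cols = len(rows[0])
    # Assumes every column has at least one observed value.
    col_means = [
        mean([r[j] for r in rows if r[j] is not None]) for j in range(n_cols)
    ]
    return [[col_means[j] if v is None else v for j, v in enumerate(r)] for r in rows]


def impute_by_class(rows, labels):
    """Fill None entries with the column mean of rows sharing the same label."""
    out = [list(r) for r in rows]
    for c in set(labels):
        idx = [i for i, y in enumerate(labels) if y == c]
        # Reuse the global routine on the subset belonging to class c.
        filled = impute_global([rows[i] for i in idx])
        for i, r in zip(idx, filled):
            out[i] = r
    return out


# Toy imbalanced data (hypothetical): four majority rows, two minority rows.
rows = [
    [1.0, 10.0],
    [1.5, 11.0],
    [0.5, 9.0],
    [1.0, None],   # majority row with a missing second feature
    [5.0, 50.0],
    [None, 52.0],  # minority row with a missing first feature
]
labels = [0, 0, 0, 0, 1, 1]

global_fill = impute_global(rows)
class_fill = impute_by_class(rows, labels)

print("global   :", global_fill[5])
print("per class:", class_fill[5])
```

In this toy setup the global estimate for the minority row is pulled toward the majority class, while the per-class estimate stays near the minority values. A class-conditioned generative model pursues the same idea with a learned distribution in place of a mean.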
