Generative Imputation and Stochastic Prediction
Mohammad Kachuee, Kimmo Karkkainen, Orpaz Goldstein, Sajad Darabi, Majid Sarrafzadeh

TL;DR
This paper introduces a generative approach for imputing missing data and estimating class uncertainties, improving classification performance on incomplete datasets across image and tabular data.
Contribution
It proposes a simple generator-discriminator-predictor framework to generate imputations and capture classification uncertainties in the presence of missing data.
Findings
Effective imputation of missing features on image and tabular datasets.
Accurate estimation of class uncertainties with incomplete data.
Improved classification performance under various missingness conditions.
Abstract
In many machine learning applications, we are faced with incomplete datasets. In the literature, missing data imputation techniques have been mostly concerned with filling missing values. However, the existence of missing values is synonymous with uncertainties not only over the distribution of missing values but also over target class assignments that require careful consideration. In this paper, we propose a simple and effective method for imputing missing features and estimating the distribution of target assignments given incomplete data. In order to make imputations, we train a simple and effective generator network to generate imputations that a discriminator network is tasked to distinguish. Following this, a predictor network is trained using the imputed samples from the generator network to capture the classification uncertainties and make predictions accordingly. The proposed…
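The prediction step described in the abstract can be illustrated with a minimal sketch. It assumes that class uncertainty is estimated by averaging a predictor's output over several imputations of the same incomplete input; the abstract does not spell out this detail, so treat it as one plausible reading rather than the authors' implementation. The Gaussian `sampler` stands in for a trained generator, and the fixed linear `weights` stand in for a trained predictor; both are hypothetical placeholders, as are all function names.

```python
import math
import random

def impute(x, mask, sampler):
    """Keep observed entries (mask=1); fill missing ones (mask=0) from sampler."""
    return [xi if m else sampler() for xi, m in zip(x, mask)]

def softmax(logits):
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_proba(x, weights):
    """Hypothetical stand-in predictor: one linear layer followed by softmax."""
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    return softmax(logits)

def stochastic_predict(x, mask, weights, sampler, n_samples=200):
    """Average class probabilities over n_samples imputations of one input."""
    totals = [0.0] * len(weights)
    for _ in range(n_samples):
        probs = predict_proba(impute(x, mask, sampler), weights)
        totals = [t + p for t, p in zip(totals, probs)]
    return [t / n_samples for t in totals]

rng = random.Random(0)
sampler = lambda: rng.gauss(0.0, 1.0)  # placeholder for a trained generator
weights = [[2.0, 0.0], [0.0, 2.0]]     # placeholder for a trained predictor
x = [1.0, 0.0]                         # the second value is missing; 0.0 is a filler
mask = [1, 0]                          # 1 = observed, 0 = missing
avg = stochastic_predict(x, mask, weights, sampler)
print(avg)
```

When every feature is observed, the average collapses to a single deterministic prediction. With missing features, the spread of the imputations carries through to the averaged class probabilities, which is the sense in which missingness induces uncertainty over target assignments.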
Taxonomy
Topics: Machine Learning and Data Classification · Machine Learning and Algorithms · Imbalanced Data Classification Techniques
