ConvGeN: Convex space learning improves deep-generative oversampling for tabular imbalanced classification on smaller datasets
Kristian Schultz, Saptarshi Bej, Waldemar Hahn, Markus Wolfien, Prashant Srivastava, Olaf Wolkenhauer

TL;DR
ConvGeN is a deep generative model that uses convex space learning to generate synthetic minority-class samples, improving classification performance on small imbalanced tabular datasets relative to existing deep generative models.
Contribution
The paper proposes ConvGeN, which combines convex space learning with deep generative networks. On small imbalanced tabular datasets it outperforms existing deep generative models and performs on par with linear interpolation methods.
Findings
ConvGeN outperforms existing deep generative models on small imbalanced tabular datasets.
ConvGeN achieves classification performance comparable to linear interpolation approaches.
ConvGeN's approach to synthetic tabular data generation has potential uses beyond imbalanced classification.
Abstract
Data is commonly stored in tabular format. Several fields of research are prone to small imbalanced tabular data. Supervised machine learning on such data is often difficult due to class imbalance. Synthetic data generation, i.e., oversampling, is a common remedy used to improve classifier performance. State-of-the-art linear interpolation approaches, such as LoRAS and ProWRAS, can be used to generate synthetic samples from the convex space of the minority class to improve classifier performance in such cases. Deep generative networks are common deep learning approaches for synthetic sample generation, widely used for synthetic image generation. However, their scope in synthetic tabular data generation in the context of imbalanced classification has not been adequately explored. In this article, we show that existing deep generative models perform poorly compared to linear interpolation based…
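To make "generating synthetic samples from the convex space of the minority class" concrete, here is a minimal sketch in plain Python. It is not the LoRAS, ProWRAS, or ConvGeN algorithm: it only illustrates the shared underlying idea that a synthetic point is a weighted average of a few minority samples, with non-negative weights summing to 1. The function name `convex_oversample`, the parameter `k`, and the random choice of which samples to combine are all assumptions made for illustration.

```python
import random


def convex_oversample(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points, each a random convex combination
    of k minority samples (illustrative only, not the paper's method)."""
    rng = random.Random(seed)
    dim = len(minority[0])
    synthetic = []
    for _ in range(n_new):
        # Assumption: the k samples are picked uniformly at random.
        # Real methods typically restrict them to a local neighbourhood.
        chosen = rng.sample(minority, k)
        # Non-negative random weights, normalised so they sum to 1.
        raw = [rng.random() + 1e-12 for _ in range(k)]
        total = sum(raw)
        weights = [w / total for w in raw]
        # Weighted average, computed feature by feature.
        point = [
            sum(w * row[j] for w, row in zip(weights, chosen))
            for j in range(dim)
        ]
        synthetic.append(point)
    return synthetic


# Toy minority class: the four corners of the unit square.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
synthetic = convex_oversample(minority, n_new=5, k=3, seed=42)
```

Because each synthetic point is a convex combination, it always lies inside the convex hull of the samples it was built from; in the toy example every generated point falls inside the unit square.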
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · AI in cancer detection · Imbalanced Data Classification Techniques
