How to train your draGAN: A task oriented solution to imbalanced classification

Leon O. Guertler; Andri Ashfahani; Anh Tuan Luu

arXiv:2211.10065·cs.LG·November 21, 2022

TL;DR

This paper introduces draGAN, a GAN-based architecture designed specifically for imbalanced classification. It generates data to optimize the classifier's performance rather than similarity to the real data, and it is reported to outperform existing SMOTE-family and GAN-based methods.

Contribution

The paper presents draGAN, a new task-oriented GAN architecture that generates both minority and majority samples to improve classification performance on imbalanced datasets.

Findings

01  draGAN outperforms state-of-the-art SMOTE-family and GAN-based methods.

02  Empirical results on 94 tabular datasets show improved classification performance.

03  The paper also highlights limitations of draGAN in certain scenarios.

Abstract

The long-standing challenge of building effective classification models for small and imbalanced datasets has seen little improvement since the creation of the Synthetic Minority Over-sampling Technique (SMOTE) over 20 years ago. Though GAN based models seem promising, there has been a lack of purpose built architectures for solving the aforementioned problem, as most previous studies focus on applying already existing models. This paper proposes a unique, performance-oriented, data-generating strategy that utilizes a new architecture, coined draGAN, to generate both minority and majority samples. The samples are generated with the objective of optimizing the classification model's performance, rather than similarity to the real data. We benchmark our approach against state-of-the-art methods from the SMOTE family and competitive GAN based approaches on 94 tabular datasets with varying…
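For context on the baseline family the abstract refers to, the following is a minimal sketch of SMOTE-style oversampling: each synthetic minority sample is placed at a random point on the line segment between a real minority sample and one of its k nearest minority neighbours. It illustrates the similarity-driven baseline only, not draGAN, whose architecture is not described on this page. The function name `smote_sample` and its parameters are illustrative assumptions, not taken from the paper or its repository.

```python
import numpy as np


def smote_sample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic rows by SMOTE-style interpolation.

    Hypothetical helper for illustration; not the paper's code.
    """
    rng = np.random.default_rng(seed)
    minority = np.asarray(minority, dtype=float)
    n = len(minority)
    if n < 2:
        raise ValueError("need at least two minority samples")
    k = min(k, n - 1)

    # Pairwise Euclidean distances between minority samples.
    diffs = minority[:, None, :] - minority[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    np.fill_diagonal(dists, np.inf)  # a point is not its own neighbour

    # Indices of the k nearest minority neighbours of every sample.
    neighbours = np.argsort(dists, axis=1)[:, :k]

    # Pick a base sample, then one of its neighbours, for each new row.
    base = rng.integers(0, n, size=n_new)
    picks = neighbours[base, rng.integers(0, k, size=n_new)]

    # Interpolate a random fraction of the way from base to neighbour.
    gaps = rng.random((n_new, 1))
    return minority[base] + gaps * (minority[picks] - minority[base])


if __name__ == "__main__":
    X_min = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
    print(smote_sample(X_min, n_new=3, k=2))
```

Because every synthetic row lies between two real minority rows, this baseline can only produce data that resembles the existing minority class. The abstract contrasts this with generating samples whose objective is the classifier's performance.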

Code & Models

Repositories

leonguertler/dragan (PyTorch, official)

Taxonomy

Topics: Imbalanced Data Classification Techniques · Machine Learning in Healthcare · Artificial Intelligence in Healthcare

Methods: Synthetic Minority Over-sampling Technique (SMOTE)