Deep Learning Meets Oversampling: A Learning Framework to Handle Imbalanced Classification

Sukumar Kishanthan; Asela Hevapathige

arXiv:2502.06878 · cs.LG · February 12, 2025

Open Access

TL;DR

This paper introduces a novel deep learning framework that integrates data oversampling directly into the training process, improving classification performance on imbalanced datasets by generating synthetic data in a data-driven manner.

Contribution

The paper proposes a new learning framework that formulates oversampling as a decision-based process, enhancing model representation for imbalanced classification tasks.

Findings

01. Outperforms state-of-the-art algorithms on imbalanced datasets

02. Generates synthetic data in a data-driven manner

03. Improves model representation and classification accuracy

Abstract

Despite extensive research spanning several decades, class imbalance is still considered a profound difficulty for both machine learning and deep learning models. While data oversampling is the foremost technique to address this issue, traditional sampling techniques are often decoupled from the training phase of the predictive model, resulting in suboptimal representations. To address this, we propose a novel learning framework that can generate synthetic data instances in a data-driven manner. The proposed framework formulates the oversampling process as a composition of discrete decision criteria, thereby enhancing the representation power of the model's learning process. Extensive experiments on the imbalanced classification task demonstrate the superiority of our framework over state-of-the-art algorithms.
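
For context, the "traditional" oversampling that the abstract contrasts against can be sketched as a preprocessing step that runs once, before any model is trained. The snippet below is a minimal SMOTE-style interpolation in plain Python, offered only as an illustration of that decoupled baseline; it is not the framework proposed in the paper. The function name `interpolate_minority`, its parameters, and the toy data are all assumptions made for this example.

```python
import random


def interpolate_minority(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (SMOTE-style).

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        # Pick a random point on the segment between base and neighbour.
        t = rng.random()
        synthetic.append([a + t * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Toy data (assumed): 6 majority points and 3 minority points, 2 features each.
majority = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.1, 0.0], [0.0, 0.3]]
minority = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.1]]

# Oversample the minority class until both classes have the same size.
new_points = interpolate_minority(minority, len(majority) - len(minority))
balanced_minority = minority + new_points
```

Because the synthetic points here are fixed before training starts, the classifier has no influence on where they are placed. That separation is the decoupling the abstract describes as leading to suboptimal representations.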

Taxonomy

Topics: Medical Coding and Health Information · Imbalanced Data Classification Techniques