Tackling Diverse Minorities in Imbalanced Classification
Kwei-Herng Lai, Daochen Zha, Huiyuan Chen, Mangesh Bendre, Yuzhong Chen, Mahashweta Das, Hao Yang, Xia Hu

TL;DR
This paper introduces a novel iterative data augmentation framework using reinforcement learning to generate synthetic minority samples, effectively improving classification in highly imbalanced and diverse minority scenarios.
Contribution
It formulates the data augmentation process as an MDP and employs an actor-critic approach to adaptively generate synthetic samples, addressing the challenge of diverse minority distributions.
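To make the MDP framing concrete, here is a minimal sketch of what an environment for iterative mix-up augmentation could look like. It is not the authors' implementation: the class `MixupAugmentationEnv`, the state and action definitions, the `horizon` parameter, and the user-supplied `reward_fn` are all hypothetical, and a uniformly random policy stands in for the actor-critic agent, which this summary does not specify.

```python
import numpy as np


class MixupAugmentationEnv:
    """Toy MDP (hypothetical): the state is the current synthetic sample,
    an action is (partner index, mixing ratio)."""

    def __init__(self, X, reward_fn, horizon=5, seed=0):
        self.X = np.asarray(X, dtype=float)
        self.reward_fn = reward_fn  # assumed: scores a synthetic sample
        self.horizon = horizon      # assumed: fixed number of mix-up steps
        self.rng = np.random.default_rng(seed)

    def reset(self, source_idx):
        # Each episode starts from a chosen source sample.
        self.t = 0
        self.state = self.X[source_idx].copy()
        return self.state

    def step(self, action):
        partner_idx, lam = action
        # Transition: mix the current state with the chosen partner sample.
        self.state = lam * self.state + (1.0 - lam) * self.X[partner_idx]
        self.t += 1
        reward = self.reward_fn(self.state)
        done = self.t >= self.horizon
        return self.state, reward, done


def random_policy(env):
    # Placeholder for a learned actor: uniform partner and mixing ratio.
    partner = int(env.rng.integers(len(env.X)))
    lam = float(env.rng.uniform(0.0, 1.0))
    return partner, lam


def rollout(env, source_idx, policy=random_policy):
    # Run one episode and collect every intermediate synthetic sample.
    env.reset(source_idx)
    samples, total, done = [], 0.0, False
    while not done:
        state, reward, done = env.step(policy(env))
        samples.append(state.copy())
        total += reward
    return np.array(samples), total
```

In an actor-critic setup, the random policy would be replaced by a learned actor that proposes the partner and mixing ratio, with a critic estimating the expected return of each state; the reward would typically be tied to downstream classifier performance, though the exact design here is an assumption.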
Findings
Improved classifier performance on imbalanced datasets.
Effective handling of diverse minority distributions.
Robustness across multiple classifiers and datasets.
Abstract
Imbalanced datasets are commonly observed in various real-world applications, presenting significant challenges in training classifiers. When working with large datasets, the imbalance issue can be further exacerbated, making it exceptionally difficult to train classifiers effectively. To address the problem, over-sampling techniques have been developed that linearly interpolate data instances between minority samples and their neighbors. However, in many real-world scenarios such as anomaly detection, minority instances are often dispersed diversely in the feature space rather than clustered together. Inspired by domain-agnostic data mix-up, we propose generating synthetic samples iteratively by mixing data samples from both minority and majority classes. Developing such a framework is non-trivial; the challenges include source sample selection, mix-up strategy selection, and the…
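The two augmentation styles contrasted in the abstract can be sketched in a few lines of NumPy. This is an illustrative sketch, not the paper's code: `interpolate_with_neighbor` is a simplified SMOTE-style interpolation among minority samples, and `mixup` is the generic convex combination of two samples, which may come from different classes. Both function names and the parameter `k` are assumptions, and how labels are assigned to mixed samples is not covered by this excerpt.

```python
import numpy as np


def interpolate_with_neighbor(X_min, k=3, rng=None):
    """SMOTE-style sketch: one synthetic point per minority sample, placed on
    the segment to a random one of its k nearest minority neighbors.
    Requires at least two minority samples."""
    rng = np.random.default_rng() if rng is None else rng
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    # Pairwise Euclidean distances; exclude each point from its own neighbors.
    dists = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)
    k = min(k, n - 1)
    neighbors = np.argsort(dists, axis=1)[:, :k]
    chosen = neighbors[np.arange(n), rng.integers(k, size=n)]
    # Random position along the segment between sample and chosen neighbor.
    gap = rng.uniform(0.0, 1.0, size=(n, 1))
    return X_min + gap * (X_min[chosen] - X_min)


def mixup(x_a, x_b, lam):
    """Generic mix-up: convex combination of two samples, lam in [0, 1]."""
    return lam * np.asarray(x_a, dtype=float) + (1.0 - lam) * np.asarray(x_b, dtype=float)
```

When minority samples are widely dispersed, the nearest minority neighbor can be far away, so the interpolated segment may cross majority regions; mix-up, by contrast, lets the mixing partner and ratio be chosen freely, which is the degree of freedom the abstract proposes to exploit.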
