Towards Automated Imbalanced Learning with Deep Hierarchical Reinforcement Learning
Daochen Zha, Kwei-Herng Lai, Qiaoyu Tan, Sirui Ding, Na Zou, Xia Hu

TL;DR
This paper introduces AutoSMOTE, a deep hierarchical reinforcement learning-based method for automated over-sampling in imbalanced learning, which optimizes synthetic sample generation to improve classification performance.
Contribution
The paper develops a hierarchical RL framework that jointly optimizes the over-sampling decisions at each level and outperforms existing heuristic-based methods on imbalanced learning tasks.
Findings
AutoSMOTE outperforms state-of-the-art resampling algorithms on six datasets.
The hierarchical RL approach effectively models the complex decision space of synthetic sample generation.
Optimizing the classification metric directly, rather than relying on fixed heuristics, yields better results across the imbalanced datasets evaluated.
Abstract
Imbalanced learning is a fundamental challenge in data mining, where there is a disproportionate ratio of training samples in each class. Over-sampling is an effective technique to tackle imbalanced learning through generating synthetic samples for the minority class. While numerous over-sampling algorithms have been proposed, they heavily rely on heuristics, which could be sub-optimal since we may need different sampling strategies for different datasets and base classifiers, and they cannot directly optimize the performance metric. Motivated by this, we investigate developing a learning-based over-sampling algorithm to optimize the classification performance, which is a challenging task because of the huge and hierarchical decision space. At the high level, we need to decide how many synthetic samples to generate. At the low level, we need to determine where the synthetic samples…
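To make the two decision levels concrete, here is a minimal sketch of classic SMOTE-style interpolation, the kind of heuristic over-sampling the abstract contrasts with a learned approach. It is not the AutoSMOTE implementation. The function name `smote_like_oversample` and its parameters are hypothetical, and every choice that a learned policy would make is drawn at random here: `n_synthetic` stands in for the high-level decision (how many samples), while the base sample, the neighbour, and the interpolation weight stand in for the low-level decisions (where each sample goes).

```python
import numpy as np


def smote_like_oversample(X_min, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and their k nearest minority neighbours (SMOTE-style heuristic).

    Hypothetical helper for illustration only; not the paper's method.
    """
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    if n < 2:
        raise ValueError("need at least two minority samples")
    k = min(k, n - 1)

    # Pairwise Euclidean distances among minority samples.
    dists = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)  # a sample is not its own neighbour
    neighbours = np.argsort(dists, axis=1)[:, :k]  # shape (n, k)

    # Low-level decisions, chosen uniformly at random in this sketch:
    base = rng.integers(0, n, size=n_synthetic)   # which sample to start from
    pick = rng.integers(0, k, size=n_synthetic)   # which neighbour to move toward
    lam = rng.random(n_synthetic)[:, None]        # how far along the segment

    partner = neighbours[base, pick]
    return X_min[base] + lam * (X_min[partner] - X_min[base])


# Usage: four minority points on the unit square, ten synthetic samples.
X_min = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
synthetic = smote_like_oversample(X_min, n_synthetic=10, k=2)
```

Because each synthetic point lies on a segment between two existing minority samples, the output stays inside their convex hull. A learning-based method would replace the random draws with choices optimized against the classifier's validation metric.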
Taxonomy
Topics: Imbalanced Data Classification Techniques · Artificial Intelligence in Healthcare · Machine Learning and Data Classification
Methods: Synthetic Minority Over-sampling Technique · Balanced Selection
