A Novel Hybrid Sampling Framework for Imbalanced Learning
Asif Newaz, Farhan Shahriyar Haq

TL;DR
This paper introduces a hybrid sampling framework that combines three sampling techniques to improve classification on imbalanced datasets, and reports better results than existing sampling methods across 26 datasets.
Contribution
The study proposes a new hybrid sampling algorithm, SMOTE-RUS-NC, and integrates it into an ensemble framework, SRN-BRF, to effectively handle class imbalance.
Findings
Outperforms existing sampling methods on 26 datasets.
Achieves significant improvements on highly imbalanced data.
Demonstrates robustness and superior accuracy in diverse scenarios.
Abstract
Class imbalance is a frequently occurring scenario in classification tasks. Learning from imbalanced data poses a major challenge, which has instigated a lot of research in this area. Data preprocessing using sampling techniques is a standard approach to deal with the imbalance present in the data. Since standard classification algorithms do not perform well on imbalanced data, the dataset needs to be adequately balanced before training. This can be accomplished by oversampling the minority class or undersampling the majority class. In this study, a novel hybrid sampling algorithm has been proposed. To overcome the limitations of the sampling techniques while ensuring the quality of the retained sampled dataset, a sophisticated framework has been developed to properly combine three different sampling techniques. Neighborhood Cleaning rule is first applied to reduce the imbalance. Random…
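The two balancing strategies named in the abstract, undersampling the majority class and oversampling the minority class, can be sketched in plain Python. This is a minimal illustration and not the paper's SMOTE-RUS-NC algorithm: the Neighborhood Cleaning step is left out, the order of the two steps and the target counts are assumptions, and the function names `random_undersample` and `smote_like_oversample` are hypothetical.

```python
import random
from collections import Counter


def random_undersample(X, y, majority_label, target_count, rng):
    """Keep a random subset of the majority class and every other sample."""
    maj_idx = [i for i, label in enumerate(y) if label == majority_label]
    keep = set(rng.sample(maj_idx, min(target_count, len(maj_idx))))
    kept = [i for i, label in enumerate(y)
            if label != majority_label or i in keep]
    return [X[i] for i in kept], [y[i] for i in kept]


def smote_like_oversample(X, y, minority_label, target_count, k, rng):
    """Add synthetic minority points by interpolating between a minority
    sample and one of its k nearest minority neighbours (SMOTE-style)."""
    minority = [x for x, label in zip(X, y) if label == minority_label]
    X_new, y_new = list(X), list(y)
    if len(minority) < 2:
        return X_new, y_new  # nothing to interpolate between
    for _ in range(max(0, target_count - len(minority))):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance.
        others = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        X_new.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
        y_new.append(minority_label)
    return X_new, y_new


# Toy data (made up for illustration): 90 majority and 10 minority points.
rng = random.Random(0)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(90)]
X += [[rng.gauss(3, 1), rng.gauss(3, 1)] for _ in range(10)]
y = [0] * 90 + [1] * 10

# Assumed targets: shrink the majority to 40, then grow the minority to 40.
X_u, y_u = random_undersample(X, y, majority_label=0, target_count=40, rng=rng)
X_b, y_b = smote_like_oversample(X_u, y_u, minority_label=1,
                                 target_count=40, k=5, rng=rng)
counts = Counter(y_b)
```

Meeting in the middle like this discards fewer majority samples than undersampling alone and creates fewer synthetic points than oversampling alone, which is the general motivation for hybrid sampling.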
Taxonomy
Topics: Imbalanced Data Classification Techniques · Electricity Theft Detection Techniques · Text and Document Classification Technologies
Methods: Synthetic Minority Over-sampling Technique
