Enhancement Encoding: A Novel Imbalanced Classification Approach via Encoding the Training Labels
Jia-Chen Zhao

TL;DR
This paper introduces enhancement encoding, a label encoding method designed specifically for imbalanced classification tasks, which improves minority-class performance by combining re-weighting and cost-sensitivity.
Contribution
It proposes enhancement encoding, a new label encoding technique for imbalanced data, and introduces a soft-confusion matrix to reduce validation costs.
Findings
Enhancement encoding significantly improves minority class accuracy.
The method outperforms traditional one-hot encoding in imbalanced scenarios.
It is effective across different loss functions.
Abstract
Class imbalance, also called long-tailed distribution, is a common problem in machine-learning classification tasks. When it occurs, the minority data are overwhelmed by the majority, which poses a considerable challenge for data science. To address class imbalance, researchers have proposed many methods: some balance the data set (SMOTE), some refine the loss function (Focal Loss), and some have observed that the value of labels influences class-imbalanced learning (Yang and Xu, "Rethinking the value of labels for improving class-imbalanced learning," NeurIPS 2020), but no one has yet changed the way the labels themselves are encoded. Today, the prevailing technique for encoding labels is one-hot encoding, owing to its good performance in general settings. However, it is not a good choice for imbalanced data, because the…
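For readers unfamiliar with the baselines the abstract names, the following is a minimal sketch of standard one-hot label encoding and of the focal-loss idea of down-weighting well-classified samples. It is not the paper's enhancement encoding, whose details are not given in this summary. The function names, the three-class setup, the probability values, and the choice gamma = 2 are illustrative assumptions.

```python
import math


def one_hot(label, num_classes):
    """Standard one-hot vector for an integer class label (the baseline encoding)."""
    return [1.0 if k == label else 0.0 for k in range(num_classes)]


def cross_entropy(target, probs):
    """Cross-entropy between an encoded target and predicted class probabilities."""
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)


def focal_loss(target, probs, gamma=2.0):
    """Focal loss: cross-entropy scaled by (1 - p)^gamma on the true class.

    gamma = 2.0 is an assumed, commonly used value, not one taken from this paper.
    """
    return -sum(
        t * (1.0 - p) ** gamma * math.log(p)
        for t, p in zip(target, probs)
        if t > 0
    )


# Hypothetical predictions for a 3-class problem.
easy_majority = [0.9, 0.05, 0.05]  # true class 0, confidently correct
hard_minority = [0.6, 0.1, 0.3]    # true class 2, poorly predicted

ce_easy = cross_entropy(one_hot(0, 3), easy_majority)
fl_easy = focal_loss(one_hot(0, 3), easy_majority)
ce_hard = cross_entropy(one_hot(2, 3), hard_minority)
fl_hard = focal_loss(one_hot(2, 3), hard_minority)

print(f"easy sample: CE={ce_easy:.4f}  focal={fl_easy:.4f}")
print(f"hard sample: CE={ce_hard:.4f}  focal={fl_hard:.4f}")
```

In this sketch the label vector is identical in form for every class, so any emphasis on the minority class comes only from the loss function. The abstract's point is that the encoding itself could carry that emphasis instead.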
Taxonomy
Topics: Imbalanced Data Classification Techniques · Currency Recognition and Detection · Electricity Theft Detection Techniques
