Label-Occurrence-Balanced Mixup for Long-tailed Recognition
Shaoyu Zhang, Chen Chen, Xiujuan Zhang, Silong Peng

TL;DR
This paper introduces Label-Occurrence-Balanced Mixup, a data augmentation technique designed to address label imbalance in long-tailed recognition tasks, significantly improving performance on vision and sound benchmarks.
Contribution
The paper proposes a mixup variant that keeps the label occurrence of each class statistically balanced by drawing samples with class-balanced samplers, which improves learning on long-tailed data.
Findings
Improves mixup effectiveness on imbalanced datasets
Achieves superior results on vision and sound benchmarks
Addresses label suppression in long-tailed recognition
Abstract
Mixup is a popular data augmentation method, with many variants subsequently proposed. These methods mainly create new examples via convex combinations of random data pairs and their corresponding one-hot labels. However, most of them adhere to a random sampling and mixing strategy, without considering the frequency of label occurrence in the mixing process. When applying mixup to long-tailed data, a label suppression issue arises, where the frequency of label occurrence for each class is imbalanced and most of the new examples will be completely or partially assigned head labels. The suppression effect may further aggravate the problem of data imbalance and lead to poor performance on tail classes. To address this problem, we propose Label-Occurrence-Balanced Mixup to augment data while keeping the label occurrence for each class statistically balanced. In a word, we employ two…
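The label suppression effect described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' implementation: it mixes labels only (inputs are omitted), draws the mixing weight from a Beta(alpha, alpha) distribution as is conventional for mixup, and uses a hypothetical three-class toy dataset. It compares the average soft-label mass per class when pairs are drawn uniformly over examples against pairs where both labels are drawn uniformly over classes, which is one simple reading of "class-balanced samplers". The sampler design actually used in the paper is truncated in the abstract above and may differ.

```python
import random

def mixup_label(y_a, y_b, lam, num_classes):
    """Convex combination of two one-hot labels with weight lam."""
    soft = [0.0] * num_classes
    soft[y_a] += lam
    soft[y_b] += 1.0 - lam
    return soft

def label_occurrence(draw_pair, num_classes, n_mix, rng, alpha=1.0):
    """Average soft-label mass per class over n_mix mixed examples."""
    totals = [0.0] * num_classes
    for _ in range(n_mix):
        y_a, y_b = draw_pair()
        lam = rng.betavariate(alpha, alpha)  # assumed Beta(alpha, alpha) mixing weight
        for c, w in enumerate(mixup_label(y_a, y_b, lam, num_classes)):
            totals[c] += w
    return [t / n_mix for t in totals]

rng = random.Random(0)

# Hypothetical long-tailed toy dataset: one head class, two tail classes.
class_counts = [1000, 100, 10]
labels = [c for c, n in enumerate(class_counts) for _ in range(n)]
num_classes = len(class_counts)

# Standard mixup: both items of a pair are drawn uniformly over examples.
random_pairs = label_occurrence(
    lambda: (rng.choice(labels), rng.choice(labels)), num_classes, 20000, rng
)

# Assumed balanced variant: both labels are drawn uniformly over classes.
balanced_pairs = label_occurrence(
    lambda: (rng.randrange(num_classes), rng.randrange(num_classes)),
    num_classes, 20000, rng,
)

print("random pairs  :", [round(p, 3) for p in random_pairs])
print("balanced pairs:", [round(p, 3) for p in balanced_pairs])
```

Under random pairing the head class should absorb roughly its share of the dataset (about 90% of the label mass in this toy setup), while the balanced variant should spread the mass close to evenly across the three classes.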
Taxonomy
Topics: Music and Audio Processing · Text and Document Classification Technologies · Speech and Audio Processing
Methods: Test · Mixup
