Simple-Sampling and Hard-Mixup with Prototypes to Rebalance Contrastive Learning for Text Classification
Mengyu Li, Yonghao Liu, Fausto Giunchiglia, Ximing Li, Xiaoyue Feng, Renchu Guan

TL;DR
This paper introduces SharpReCL, a novel approach that uses class prototypes and balanced sampling to improve supervised contrastive learning for imbalanced text classification, outperforming large language models.
Contribution
The paper proposes a prototype-based balanced sampling method to enhance contrastive learning in imbalanced text classification tasks.
Findings
SharpReCL outperforms existing models on multiple datasets.
The method effectively handles data imbalance in contrastive learning.
Results surpass those of popular large language models.
Abstract
Text classification is a crucial and fundamental task in web content mining. Compared with the previous learning paradigm of pre-training and fine-tuning by cross entropy loss, the recently proposed supervised contrastive learning approach has received tremendous attention due to its powerful feature learning capability and robustness. Although several studies have incorporated this technique for text classification, some limitations remain. First, many text datasets are imbalanced, and the learning mechanism of supervised contrastive learning is sensitive to data imbalance, which may harm the model's performance. Moreover, these models leverage separate classification branches with cross entropy and supervised contrastive learning branches without explicit mutual guidance. To this end, we propose a novel model named SharpReCL for imbalanced text classification tasks. First, we obtain…
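To illustrate why supervised contrastive learning is sensitive to class imbalance, here is a minimal NumPy sketch of the generic supervised contrastive loss together with class prototypes computed as mean embeddings. This is not SharpReCL itself, since the abstract above is truncated before the method is described. The function names, the temperature `tau`, the batch sizes, and the mean-embedding definition of a prototype are all assumptions made for illustration.

```python
import numpy as np

def supcon_loss(z, y, tau=0.1):
    """Generic supervised contrastive loss (illustrative; not the paper's model)."""
    z = z / np.linalg.norm(z, axis=1, keepdims=True)    # L2-normalise embeddings
    n = len(y)
    not_self = ~np.eye(n, dtype=bool)                   # mask out each anchor itself
    pos = (y[:, None] == y[None, :]) & not_self         # same-label pairs are positives
    sim = np.where(not_self, z @ z.T / tau, -np.inf)    # similarity logits, self excluded
    m = sim.max(axis=1, keepdims=True)                  # for a stable log-sum-exp
    log_prob = sim - (m + np.log(np.exp(sim - m).sum(axis=1, keepdims=True)))
    n_pos = pos.sum(axis=1)                             # positives available per anchor
    has_pos = n_pos > 0                                 # anchors without positives are skipped
    pos_log_prob = np.where(pos, log_prob, 0.0).sum(axis=1)
    per_anchor = -pos_log_prob[has_pos] / n_pos[has_pos]
    return per_anchor.mean(), n_pos

def class_prototypes(z, y):
    """Assumed prototype definition: the mean embedding of each class."""
    classes = np.unique(y)
    return classes, np.stack([z[y == c].mean(axis=0) for c in classes])

# Hypothetical imbalanced batch: 6 samples of class 0, 2 samples of class 1.
rng = np.random.default_rng(0)
y = np.array([0] * 6 + [1] * 2)
z = rng.normal(size=(8, 4))

loss, n_pos = supcon_loss(z, y)
classes, protos = class_prototypes(z, y)
print(n_pos)         # number of positives each anchor can contrast against
print(protos.shape)  # one prototype vector per class
```

In this batch each majority-class anchor has five positives while each minority-class anchor has only one, so the minority class contributes far fewer positive pairs to the loss. Adding one prototype per class to every batch is one plausible way to guarantee each anchor at least one positive, which is the kind of rebalancing the summary above attributes to SharpReCL.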
Taxonomy
Topics: Text and Document Classification Technologies
Methods: Sparse Evolutionary Training · Contrastive Learning
