Making Pretrained Language Models Good Long-tailed Learners
Chen Zhang, Lei Ren, Jingang Wang, Wei Wu, Dawei Song

TL;DR
This paper investigates whether prompt-tuning is effective for long-tailed classification. It shows that prompt-tuning lets pretrained language models perform well on imbalanced data, and its analysis attributes the gain mainly to classifier structure and parameterization rather than to input structure.
Contribution
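The paper provides empirical evidence that prompt-tuning makes pretrained language models good long-tailed learners, and it analyzes why by progressively narrowing the gap between prompt-tuning and standard finetuning. The analysis identifies classifier structure and parameterization as the key factors.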
Findings
Prompt-tuning improves long-tailed classification performance.
Classifier structure and parameterization are key to success.
Findings extend to few-shot classification scenarios.
Abstract
Prompt-tuning has shown appealing performance in few-shot classification by virtue of its capability in effectively exploiting pre-trained knowledge. This motivates us to check the hypothesis that prompt-tuning is also a promising choice for long-tailed classification, since the tail classes are intuitively few-shot ones. To achieve this aim, we conduct empirical studies to examine the hypothesis. The results demonstrate that prompt-tuning makes pretrained language models at least good long-tailed learners. For intuitions on why prompt-tuning can achieve good performance in long-tailed classification, we carry out in-depth analyses by progressively bridging the gap between prompt-tuning and commonly used finetuning. The summary is that the classifier structure and parameterization form the key to making good long-tailed learners, in comparison with the less important input structure.…
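The contrast the abstract draws between the two classifier designs can be illustrated with a toy sketch. This is not the authors' implementation: the embeddings, the `verbalizer` mapping, and the function names are all hypothetical, and the zero-initialized finetuning head stands in for the small random initialization usually used in practice. The sketch only shows that a prompt-style classifier scores classes with parameters that already exist in the pretrained model, while a finetuning-style classifier starts from new parameters.

```python
# Toy illustration only; all values and names below are assumptions.

# Hypothetical "pretrained" output vectors for two label words,
# standing in for rows of a masked-language-model output layer.
PRETRAINED_EMB = {
    "great": [1.0, 0.0, 0.5, 0.0],
    "terrible": [-1.0, 0.0, 0.5, 0.0],
}

def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def prompt_logits(mask_hidden, verbalizer):
    """Prompt-style classifier: score each class with the pretrained
    vector of its label word (classifier parameters are reused)."""
    return {label: dot(mask_hidden, PRETRAINED_EMB[word])
            for label, word in verbalizer.items()}

def finetune_logits(cls_hidden, weights):
    """Finetuning-style classifier: score each class with a newly
    created weight vector (classifier parameters are new)."""
    return {label: dot(cls_hidden, w) for label, w in weights.items()}

# Hypothetical mapping from class labels to label words.
verbalizer = {"positive": "great", "negative": "terrible"}

# New head, zero-initialized here for determinism (an assumption).
fresh_weights = {label: [0.0, 0.0, 0.0, 0.0] for label in verbalizer}

# Toy encoder output for an input that leans positive.
hidden = [0.9, 0.1, 0.2, 0.0]

p = prompt_logits(hidden, verbalizer)
f = finetune_logits(hidden, fresh_weights)
print("prompt-style logits:  ", p)
print("finetune-style logits:", f)
```

Before any training, the prompt-style scores already separate the two classes in this toy setup, whereas the new head expresses no preference. That difference in where the classifier parameters come from is one way to read the abstract's emphasis on classifier structure and parameterization, although the paper's own analysis should be consulted for the actual experimental setup.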
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Machine Learning in Healthcare
