Text Classification in the LLM Era -- Where do we stand?
Sowmya Vajjala, Shwetali Shimangaud

TL;DR
This paper evaluates large language models for text classification on 32 datasets spanning 8 languages. It compares zero-shot classification, few-shot fine-tuning, and synthetic-data-based classifiers against classifiers trained on the complete human-labeled data, and reports where each approach does well and where it falls short.
Contribution
The paper compares LLM-based classification methods with classifiers built on smaller pre-trained language models across 32 datasets and 8 languages, reporting their relative performance and the disparities between languages.
Findings
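Zero-shot approaches do well on sentiment classification but are outperformed by other approaches on the remaining tasks.
Classifiers trained on synthetic data sourced from multiple LLMs can outperform zero-shot open LLMs.
Wide performance disparities exist across languages in all classification scenarios.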
Abstract
Large Language Models revolutionized NLP and showed dramatic performance improvements across several tasks. In this paper, we investigated the role of such language models in text classification and how they compare with other approaches relying on smaller pre-trained language models. Considering 32 datasets spanning 8 languages, we compared zero-shot classification, few-shot fine-tuning, and synthetic-data-based classifiers with classifiers built using the complete human-labeled dataset. Our results show that zero-shot approaches do well for sentiment classification but are outperformed by other approaches for the rest of the tasks, and that synthetic data sourced from multiple LLMs can build better classifiers than zero-shot open LLMs. We also see wide performance disparities across languages in all the classification scenarios. We expect that these findings would guide practitioners…
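To make the zero-shot setting concrete, here is a minimal Python sketch of how a zero-shot LLM classifier is commonly wired up: build a prompt that lists the allowed labels, send it to a model, and map the free-form reply back onto a label. This is not the authors' implementation. The prompt wording, the function names (`build_zero_shot_prompt`, `parse_label`, `zero_shot_classify`), and the `generate` callable that stands in for any LLM call are all assumptions made for illustration.

```python
from typing import Callable, Sequence


def build_zero_shot_prompt(text: str, labels: Sequence[str]) -> str:
    """Build an instruction prompt listing the allowed labels (hypothetical wording)."""
    options = ", ".join(labels)
    return (
        f"Classify the text into exactly one of these labels: {options}.\n"
        f"Text: {text}\n"
        "Answer with the label only."
    )


def parse_label(response: str, labels: Sequence[str], fallback: str) -> str:
    """Map a free-form model reply onto one of the allowed labels."""
    reply = response.strip().lower()
    # Check longer labels first so a label that contains another label wins.
    for label in sorted(labels, key=len, reverse=True):
        if label.lower() in reply:
            return label
    # The reply named no known label, so return the caller's fallback.
    return fallback


def zero_shot_classify(
    text: str,
    labels: Sequence[str],
    generate: Callable[[str], str],  # assumption: wraps whatever LLM is used
    fallback: str,
) -> str:
    """Classify one text with no training examples."""
    prompt = build_zero_shot_prompt(text, labels)
    return parse_label(generate(prompt), labels, fallback)
```

In use, `generate` would wrap a call to an open or hosted model. Substring matching is a deliberately simple parsing choice: it misfires when a reply mentions more than one label, which is one reason a fixed fallback label is passed in explicitly.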
Taxonomy
Topics: Natural Language Processing Techniques · Legal Language and Interpretation · Library Science and Information Systems
