Fine-Tuned 'Small' LLMs (Still) Significantly Outperform Zero-Shot Generative AI Models in Text Classification
Martin Juan José Bucher, Marco Martini

TL;DR
Fine-tuned small LLMs consistently outperform larger zero-shot generative models in text classification across various tasks and categories, demonstrating the effectiveness of task-specific fine-tuning over prompt-based approaches.
Contribution
This paper provides empirical evidence that fine-tuning smaller LLMs surpasses zero-shot generative models in classification tasks and offers an accessible toolkit for easy fine-tuning.
Findings
Fine-tuned LLMs outperform zero-shot models in all tested tasks.
Application-specific fine-tuning yields superior performance.
The accompanying toolkit simplifies fine-tuning for a broader range of users.
Abstract
Generative AI offers a simple, prompt-based alternative to fine-tuning smaller BERT-style LLMs for text classification tasks. This promises to eliminate the need for manually labeled training data and task-specific model training. However, it remains an open question whether tools like ChatGPT can deliver on this promise. In this paper, we show that smaller, fine-tuned LLMs (still) consistently and significantly outperform larger, zero-shot prompted models in text classification. We compare three major generative AI models (ChatGPT with GPT-3.5/GPT-4 and Claude Opus) with several fine-tuned LLMs across a diverse set of classification tasks (sentiment, approval/disapproval, emotions, party positions) and text categories (news, tweets, speeches). We find that fine-tuning with application-specific training data achieves superior performance in all cases. To make this approach more…
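To make the comparison described in the abstract concrete, the following is a minimal sketch of how predictions from two classifiers could be scored against the same hand-labeled test set. The metric choices (accuracy and macro-averaged F1), the function names, and the toy labels are all assumptions made for illustration. They are not taken from the paper and do not reproduce its results.

```python
def accuracy(gold, pred):
    """Share of items whose predicted label equals the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over all labels seen in gold or pred."""
    labels = sorted(set(gold) | set(pred))
    f1_scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1_scores.append(2 * precision * recall / (precision + recall))
        else:
            f1_scores.append(0.0)
    return sum(f1_scores) / len(f1_scores)


# Hypothetical toy data: gold labels plus the outputs of two imaginary systems.
gold = ["pos", "neg", "neg", "pos", "neg", "pos"]
fine_tuned_pred = ["pos", "neg", "neg", "pos", "neg", "neg"]
zero_shot_pred = ["pos", "pos", "neg", "neg", "neg", "pos"]

for name, pred in [("fine-tuned", fine_tuned_pred), ("zero-shot", zero_shot_pred)]:
    print(f"{name}: accuracy={accuracy(gold, pred):.3f}, macro-F1={macro_f1(gold, pred):.3f}")
```

Scoring both systems on an identical held-out set keeps the comparison fair. Macro-averaging gives each class equal weight, which can matter when label distributions are skewed.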
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques
Methods: Sparse Evolutionary Training · Cosine Annealing · Residual Connection · Softmax · Layer Normalization · Byte Pair Encoding · Adam · Attention Dropout
