Fine-Tuned 'Small' LLMs (Still) Significantly Outperform Zero-Shot Generative AI Models in Text Classification

Martin Juan José Bucher; Marco Martini

arXiv:2406.08660·cs.CL·August 19, 2024·32 cites

TL;DR

Fine-tuned small LLMs consistently outperform larger zero-shot generative models in text classification across sentiment, approval/disapproval, emotion, and party-position tasks on news, tweets, and speeches, showing that task-specific fine-tuning remains more effective than prompt-based approaches.

Contribution

This paper provides empirical evidence that fine-tuning smaller LLMs surpasses zero-shot generative models in classification tasks and offers an accessible toolkit for easy fine-tuning.

Findings

01 Fine-tuned LLMs outperform zero-shot models in all tested tasks.

02 Fine-tuning with application-specific training data yields superior performance.

03 The accompanying toolkit simplifies the fine-tuning process for a broader range of users.

Abstract

Generative AI offers a simple, prompt-based alternative to fine-tuning smaller BERT-style LLMs for text classification tasks. This promises to eliminate the need for manually labeled training data and task-specific model training. However, it remains an open question whether tools like ChatGPT can deliver on this promise. In this paper, we show that smaller, fine-tuned LLMs (still) consistently and significantly outperform larger, zero-shot prompted models in text classification. We compare three major generative AI models (ChatGPT with GPT-3.5/GPT-4 and Claude Opus) with several fine-tuned LLMs across a diverse set of classification tasks (sentiment, approval/disapproval, emotions, party positions) and text categories (news, tweets, speeches). We find that fine-tuning with application-specific training data achieves superior performance in all cases. To make this approach more…
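The comparison described in the abstract comes down to scoring each model's predicted labels against the same gold labels. The sketch below shows one plausible way to do that with accuracy and macro-averaged F1. The choice of metrics, the function names, and the toy labels are all assumptions made for illustration; none of them are taken from the paper or its repository, and the numbers carry no empirical meaning.

```python
# Illustrative sketch only: metrics, names, and data are hypothetical,
# not the paper's evaluation code or results.

def accuracy(gold, pred):
    """Fraction of items whose predicted label equals the gold label."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over all labels seen in gold or pred."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy gold labels and two hypothetical sets of model predictions.
gold = ["pos", "neg", "neu", "pos", "neg", "neu"]
preds_model_a = ["pos", "neg", "neu", "pos", "neg", "pos"]
preds_model_b = ["pos", "pos", "neu", "neg", "neg", "pos"]

for name, preds in [("model_a", preds_model_a), ("model_b", preds_model_b)]:
    print(name, round(accuracy(gold, preds), 3), round(macro_f1(gold, preds), 3))
```

Macro-averaging weights every class equally, which matters when label distributions are skewed; whether that matches the paper's reported metric is not stated in the abstract.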

Code & Models

Repositories

mnbucher/text-cls-llms
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques

Methods: Sparse Evolutionary Training · Cosine Annealing · Residual Connection · Softmax · Layer Normalization · Byte Pair Encoding · Adam · Attention Dropout