Large Language Models For Text Classification: Case Study And Comprehensive Review
Arina Kostina, Marios D. Dikaiakos, Dimosthenis Stefanidis, George Pallis

TL;DR
This paper evaluates various large language models for text classification tasks, comparing their performance, prompting strategies, and inference times across different scenarios, highlighting their strengths and limitations.
Contribution
It provides a comprehensive comparison of LLMs with traditional models in classification, analyzing performance, prompting effects, and practical trade-offs.
Findings
LLMs like Llama3 and GPT-4 outperform traditional models in complex tasks.
Prompting strategies significantly affect model responses.
Simpler models are more efficient for binary classification tasks.
Abstract
Unlocking the potential of Large Language Models (LLMs) in data classification represents a promising frontier in natural language processing. In this work, we evaluate the performance of different LLMs in comparison with state-of-the-art deep-learning and machine-learning models, in two different classification scenarios: i) the classification of employees' working locations based on job reviews posted online (multiclass classification), and ii) the classification of news articles as fake or not (binary classification). Our analysis encompasses a diverse range of language models that differ in size, quantization, and architecture. We explore the impact of alternative prompting techniques and evaluate the models based on the weighted F1-score. We also examine the trade-off between performance (F1-score) and time (inference response time) for each language model to provide a more…
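The weighted F1-score named in the abstract can be illustrated with a minimal sketch. It uses the standard definition (per-class F1 averaged with weights proportional to each class's share of the true labels) and is not taken from the paper's own evaluation code. The function name `weighted_f1` and the example labels are hypothetical.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 scores (standard definition)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    support = Counter(y_true)  # number of true instances per class
    total = len(y_true)
    score = 0.0

    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        # Each class contributes in proportion to its share of the true labels.
        score += (n / total) * f1

    return score


# Hypothetical labels for illustration only; not the paper's data.
y_true = ["office", "office", "remote", "hybrid"]
y_pred = ["office", "remote", "remote", "hybrid"]
print(weighted_f1(y_true, y_pred))
```

Because each class is weighted by its support, a frequent class influences the score more than a rare one, which is why this metric is often chosen over macro-averaged F1 when class sizes are uneven.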
Taxonomy
Topics: Text and Document Classification Technologies
Methods: Attention Is All You Need · Absolute Position Encodings · Adam · Residual Connection · Dropout · Softmax · Byte Pair Encoding · Linear Layer · Multi-Head Attention · Position-Wise Feed-Forward Layer
