A Strategy to Combine 1stGen Transformers and Open LLMs for Automatic Text Classification
Claudio M. V. de Andrade, Washington Cunha, Davi Reis, Adriana Silvina Pagano, Leonardo Rocha, Marcos André Gonçalves

TL;DR
This paper proposes a confidence-based hybrid strategy combining first-generation transformers and open LLMs for cost-effective sentiment analysis, outperforming standalone models and reducing costs.
Contribution
It introduces a novel confidence-based method to integrate 1stTRs and open LLMs, improving performance while lowering computational costs.
Findings
Hybrid approach outperforms individual models in sentiment analysis
Cost savings achieved by using less expensive models for high-confidence cases
Close performance to fine-tuned LLMs at a fraction of the cost
Abstract
Transformer models have achieved state-of-the-art results, with Large Language Models (LLMs), an evolution of first-generation transformers (1stTR), being considered the cutting edge in several NLP tasks. However, the literature has yet to conclusively demonstrate that LLMs consistently outperform 1stTRs across all NLP tasks. This study compares three 1stTRs (BERT, RoBERTa, and BART) with two open LLMs (Llama 2 and Bloom) across 11 sentiment analysis datasets. The results indicate that open LLMs may moderately outperform or match 1stTRs in 8 out of 11 datasets but only when fine-tuned. Given this substantial cost for only moderate gains, the practical applicability of these models in cost-sensitive scenarios is questionable. In this context, a confidence-based strategy that seamlessly integrates 1stTRs with open LLMs based on prediction certainty is proposed. High-confidence documents…
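The confidence-based routing described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract is truncated before it states the exact rule, so the sketch assumes that confidence is the maximum softmax probability of the 1stTR, that a fixed threshold (0.9 here, an arbitrary value) separates high- from low-confidence documents, and that low-confidence documents are deferred to the LLM. The names `route`, `toy_small`, and `toy_large` are hypothetical stand-ins.

```python
import math
from typing import Callable, List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def route(
    texts: Sequence[str],
    small_model: Callable[[str], Sequence[float]],  # assumed: returns class logits
    large_model: Callable[[str], int],              # assumed: returns a class label
    threshold: float = 0.9,                         # assumed value, to be tuned
) -> List[Tuple[int, str]]:
    """Classify each text with the cheap model; defer uncertain ones to the LLM."""
    results = []
    for text in texts:
        probs = softmax(small_model(text))
        label = max(range(len(probs)), key=probs.__getitem__)
        if probs[label] >= threshold:
            # High confidence: keep the cheap model's prediction.
            results.append((label, "small"))
        else:
            # Low confidence: pay for the more expensive model.
            results.append((large_model(text), "large"))
    return results


# Toy stand-ins for a fine-tuned 1stTR and an open LLM (hypothetical).
def toy_small(text: str) -> List[float]:
    return [4.0, 0.0] if "great" in text else [0.2, 0.0]


def toy_large(text: str) -> int:
    return 1


if __name__ == "__main__":
    for prediction in route(["great movie", "it was fine"], toy_small, toy_large):
        print(prediction)
```

Under these assumptions, the threshold sets the trade-off: raising it sends more documents to the LLM (higher cost, potentially higher effectiveness), while lowering it keeps more documents on the cheaper model.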
Taxonomy
Topics: Text and Document Classification Technologies
Methods: Softmax · Linear Layer · Attention Dropout · Dropout · WordPiece · Residual Connection · Layer Normalization · Multi-Head Attention · Linear Warmup With Linear Decay
