Combining Autoregressive and Autoencoder Language Models for Text Classification
João Gonçalves

TL;DR
This paper introduces CAALM-TC, a hybrid model combining autoregressive and autoencoder language models to improve text classification, especially in small datasets and complex tasks.
Contribution
It presents a novel hybrid approach that leverages autoregressive models for contextual information and autoencoders for classification, outperforming existing methods.
Findings
CAALM outperforms existing models on four benchmark datasets.
The hybrid approach is especially effective with small datasets.
CAALM reduces sample size requirements for accurate classification.
Abstract
This paper presents CAALM-TC (Combining Autoregressive and Autoencoder Language Models for Text Classification), a novel method that enhances text classification by integrating autoregressive and autoencoder language models. Autoregressive large language models such as OpenAI's GPT, Meta's Llama, or Microsoft's Phi offer promising prospects for content analysis practitioners, but they generally underperform supervised BERT-based models for text classification. CAALM leverages autoregressive models to generate contextual information based on input texts, which is then combined with the original text and fed into an autoencoder model for classification. This hybrid approach capitalizes on the extensive contextual knowledge of autoregressive models and the efficient classification capabilities of autoencoders. Experimental results on four benchmark datasets demonstrate that CAALM…
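The data flow described in the abstract (generate context, combine it with the original text, classify the combined input) can be sketched as below. This is a minimal illustration only, not the author's implementation: the function names, the `[SEP]` separator, the text-then-context ordering, and the toy generator and classifier are all assumptions made here so the example runs without any model downloads. In practice, `generate_context` would wrap an autoregressive model and `classify` a fine-tuned autoencoder (BERT-style) classifier.

```python
from typing import Callable, List

# Hypothetical separator: the source does not say how the two texts are joined.
SEPARATOR = " [SEP] "


def build_augmented_input(
    text: str,
    generate_context: Callable[[str], str],
    separator: str = SEPARATOR,
) -> str:
    """Join the original text with context generated for it.

    The order (original text first, generated context second) is an
    assumption; the abstract only says the two are combined.
    """
    context = generate_context(text).strip()
    # Fall back to the plain text if the generator returns nothing.
    return f"{text}{separator}{context}" if context else text


def classify_with_context(
    texts: List[str],
    generate_context: Callable[[str], str],
    classify: Callable[[str], str],
) -> List[str]:
    """Run the two-stage flow: generate context, then classify the combined input."""
    return [classify(build_augmented_input(t, generate_context)) for t in texts]


# --- Toy stand-ins so the sketch runs offline (not real models) ---

def toy_generator(text: str) -> str:
    """Stand-in for an autoregressive model producing contextual information."""
    if "match" in text.lower():
        return "This text is about sports."
    return "This text is about politics."


def toy_classifier(augmented_text: str) -> str:
    """Stand-in for a supervised autoencoder classifier."""
    return "sports" if "sports" in augmented_text else "politics"


if __name__ == "__main__":
    docs = ["The match ended 2-1.", "Parliament passed the bill."]
    print(classify_with_context(docs, toy_generator, toy_classifier))
```

Passing the two stages in as callables keeps the sketch independent of any particular model library; either stage can be swapped for a real model without changing the combining logic.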
Taxonomy
Topics: Text and Document Classification Technologies
Methods: Attention Is All You Need · Attention Dropout · Dense Connections · Cosine Annealing · Adam · Residual Connection · Weight Decay · LLaMA · Byte Pair Encoding
