Classifier Language Models: Unifying Sparse Finetuning and Adaptive Tokenization for Specialized Classification Tasks

Adit Krishnan; Chu Wang; Chris Kong

arXiv:2508.08635 · cs.LG · August 13, 2025

TL;DR

This paper introduces a token-driven sparse finetuning method for small language models. It improves specialized semantic classification by focusing adaptation on task-relevant tokens without adding parameters, and it outperforms end-to-end finetuning, LoRA, layer selection, and prefix tuning.

Contribution

The work presents a token-based sparse finetuning strategy that improves both efficiency and performance on specialized classification tasks without adding parameters to the model.
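To make the idea of sparse finetuning without extra parameters concrete, here is a minimal sketch. It is not the authors' algorithm: this page does not describe how tokens are selected or which weights are updated. The frequency-based selection rule, the choice to update embedding rows, and the names `select_token_ids` and `sparse_sgd_step` are all assumptions made purely for illustration.

```python
from typing import Dict, List, Set


def select_token_ids(task_token_ids: List[List[int]], top_k: int) -> Set[int]:
    """Hypothetical selection rule (an assumption, not from the paper):
    keep the top_k most frequent token ids seen in the task data."""
    counts: Dict[int, int] = {}
    for sequence in task_token_ids:
        for token_id in sequence:
            counts[token_id] = counts.get(token_id, 0) + 1
    # Sort by descending count, breaking ties by token id for determinism.
    ranked = sorted(counts, key=lambda t: (-counts[t], t))
    return set(ranked[:top_k])


def sparse_sgd_step(
    embeddings: List[List[float]],
    grads: List[List[float]],
    trainable_rows: Set[int],
    lr: float,
) -> List[List[float]]:
    """Apply one SGD step to the selected rows only.

    Rows outside trainable_rows are copied unchanged (frozen), so the
    parameter count is identical before and after the update.
    """
    updated = []
    for row_index, (row, grad_row) in enumerate(zip(embeddings, grads)):
        if row_index in trainable_rows:
            updated.append([w - lr * g for w, g in zip(row, grad_row)])
        else:
            updated.append(list(row))
    return updated


# Toy example: a 3-token vocabulary with 2-dimensional embeddings.
embeddings = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
grads = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
rows = select_token_ids([[0, 2, 2], [2, 0, 1]], top_k=2)
new_embeddings = sparse_sgd_step(embeddings, grads, rows, lr=0.5)
```

In this toy run only the rows for tokens 0 and 2 change, while row 1 stays frozen. The contrast with adapter-style methods such as LoRA is that no new weight matrices are introduced; a subset of the existing parameters is updated in place.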

Findings

01. Outperforms end-to-end finetuning, LoRA, layer selection, and prefix tuning.

02. Achieves greater stability and halves training costs.

03. Effective across five diverse semantic classification tasks.

Abstract

Semantic text classification requires understanding the contextual significance of specific tokens rather than surface-level patterns or keywords (as in rule-based or statistical text classification), making large language models (LLMs) well suited to this task. However, semantic classification applications in industry, like customer intent detection or semantic role labeling, tend to be highly specialized. They require annotation by domain experts, in contrast to the general-purpose corpora used for pretraining. Further, they typically require high inference throughput, which limits model size for latency and cost reasons. Thus, for a range of specialized classification tasks, the preferred solution is to develop customized classifiers by finetuning smaller language models (e.g., mini-encoders, small language models). In this work, we develop a token-driven sparse finetuning…

Taxonomy

Topics: Topic Modeling · Text and Document Classification Technologies · Sentiment Analysis and Opinion Mining