Divide, Cache, Conquer: Dichotomic Prompting for Efficient Multi-Label LLM-Based Classification
Mikołaj Langner, Jan Eliasz, Ewa Rudnicka, Jan Kocoń

TL;DR
This paper presents an efficient method for multi-label classification with large language models: each task is reformulated as a sequence of yes/no questions and combined with prefix caching, which speeds up inference without loss of accuracy.
Contribution
The paper introduces a dichotomic prompting approach with prefix caching for scalable multi-label classification using LLMs, validated on affective text analysis.
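A minimal sketch of how dichotomic prompting might be set up so that a prefix cache can help. It is an illustration under stated assumptions, not the authors' implementation: the prompt wording, the three example dimensions, and the `ask` callable (a stand-in for whatever LLM client is used) are all hypothetical. The point it shows is that every per-dimension prompt starts with the same instructions and text, so an inference engine with prefix caching only has to process that shared part once.

```python
from typing import Callable, Dict, List

# Hypothetical subset of labels; the paper covers 24 dimensions not listed here.
DIMENSIONS = ["joy", "anger", "sadness"]


def build_prefix(text: str) -> str:
    """Shared part of every prompt: instructions plus the text to classify."""
    return (
        "You are an annotator. Answer with 'yes' or 'no' only.\n"
        f"Text: {text}\n"
    )


def build_prompts(text: str, dimensions: List[str]) -> List[str]:
    """One yes/no prompt per dimension, all starting with the same prefix."""
    prefix = build_prefix(text)
    return [
        f"{prefix}Question: Does the text express {dim}?\nAnswer:"
        for dim in dimensions
    ]


def parse_answer(raw: str) -> bool:
    """Map a free-form model reply to a binary decision."""
    return raw.strip().lower().startswith("yes")


def classify(
    text: str, dimensions: List[str], ask: Callable[[str], str]
) -> Dict[str, bool]:
    """Query each dimension independently; `ask` is a hypothetical LLM call."""
    prompts = build_prompts(text, dimensions)
    return {dim: parse_answer(ask(p)) for dim, p in zip(dimensions, prompts)}
```

Placing the variable question at the end of the prompt is the design choice that matters here: anything that differs between queries invalidates the cache from that position onward, so the shared content goes first.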
Findings
Significant efficiency gains in inference speed with no accuracy loss.
Smaller models fine-tuned on LLM annotations improve significantly over their zero-shot baselines.
Method is general and applicable across various domains.
Abstract
We introduce a method for efficient multi-label text classification with large language models (LLMs), built on reformulating classification tasks as sequences of dichotomic (yes/no) decisions. Instead of generating all labels in a single structured response, each target dimension is queried independently, which, combined with a prefix caching mechanism, yields substantial efficiency gains for short-text inference without loss of accuracy. To demonstrate the approach, we focus on affective text analysis, covering 24 dimensions including emotions and sentiment. Using LLM-to-SLM distillation, a powerful annotator model (DeepSeek-V3) provides multiple annotations per text, which are aggregated to fine-tune smaller models (HerBERT-Large, CLARIN-1B, PLLuM-8B, Gemma3-1B). The fine-tuned models show significant improvements over zero-shot baselines, particularly on the dimensions seen during…
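The abstract says the annotator provides multiple annotations per text that are then aggregated, but the visible text does not say how. The sketch below assumes a strict per-dimension majority vote, with ties resolved to "no"; both the rule and the function name `aggregate` are assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, List


def aggregate(annotations: List[Dict[str, bool]]) -> Dict[str, bool]:
    """Combine repeated yes/no annotations of one text by majority vote.

    Assumption: a dimension is labelled positive only when strictly more
    than half of the runs said "yes"; the source does not specify the rule.
    """
    if not annotations:
        raise ValueError("at least one annotation is required")
    yes_votes: Counter = Counter()
    for run in annotations:
        for dim, value in run.items():
            if value:
                yes_votes[dim] += 1
    total = len(annotations)
    # Use the dimensions of the first run as the reference label set.
    return {dim: yes_votes[dim] * 2 > total for dim in annotations[0]}
```

For three runs in which "joy" receives two "yes" votes and "anger" receives one, this rule would label the text positive for "joy" and negative for "anger".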
Taxonomy
Topics: Sentiment Analysis and Opinion Mining · Text and Document Classification Technologies · Topic Modeling
