From Haystack to Needle: Label Space Reduction for Zero-shot Classification

Nathan Vandemoortele; Bram Steenwinckel; Femke Ongenae; Sofie Van Hoecke

arXiv:2502.08436·cs.CL·November 6, 2025

TL;DR

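This paper introduces Label Space Reduction (LSR), a method that improves zero-shot classification by iteratively ranking and reducing the candidate label set using unlabeled data. Across seven benchmarks, it raises macro-F1 by an average of 7.0% with Llama-3.1-70B and 3.3% with Claude-3.5-Sonnet over standard zero-shot baselines.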

Contribution

The paper proposes LSR, a novel iterative label space refinement technique that leverages unlabeled data to improve zero-shot classification with large language models.

Findings

01

LSR improves macro-F1 scores by up to 14.2% with Llama-3.1-70B.

02

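Across seven benchmarks, LSR improves macro-F1 over standard zero-shot baselines by an average of 7.0% with Llama-3.1-70B and by an average of 3.3% (up to 11.1%) with Claude-3.5-Sonnet.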

03

Distilling the model into a probabilistic classifier reduces the computational overhead of LSR, which otherwise requires an additional LLM call at each iteration.
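
The findings above are reported in macro-F1, which averages per-class F1 scores so that every class counts equally regardless of its frequency. As a small worked illustration of the metric only (the labels and predictions below are made up and are not data from the paper), it can be computed like this:

```python
from typing import Sequence


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Unweighted mean of per-class F1 over all classes seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Illustrative toy data (not from the paper).
y_true = ["a", "a", "a", "b"]
y_pred = ["a", "a", "b", "b"]
score = macro_f1(y_true, y_pred)  # class a: F1 = 0.8, class b: F1 = 2/3
```

Because each class contributes equally, a gain on a rare class moves macro-F1 as much as the same gain on a frequent one.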

Abstract

We present Label Space Reduction (LSR), a novel method for improving zero-shot classification performance of Large Language Models (LLMs). LSR iteratively refines the classification label space by systematically ranking and reducing candidate classes, enabling the model to concentrate on the most relevant options. By leveraging unlabeled data with the statistical learning capabilities of data-driven models, LSR dynamically optimizes the label space representation at test time. Our experiments across seven benchmarks demonstrate that LSR improves macro-F1 scores by an average of 7.0% (up to 14.2%) with Llama-3.1-70B and 3.3% (up to 11.1%) with Claude-3.5-Sonnet compared to standard zero-shot classification baselines. To reduce the computational overhead of LSR, which requires an additional LLM call at each iteration, we propose distilling the model into a probabilistic classifier,…
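
The iterative loop the abstract describes can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `toy_llm_classify` is a hypothetical keyword-matching stand-in for an LLM call, the token-count ranker is a simple stand-in for the data-driven model, and `top_k` and `iterations` are illustrative parameters that do not come from the paper.

```python
from collections import Counter, defaultdict
from typing import Callable, Dict, List, Sequence

# Hypothetical keyword table used only by the toy "LLM" below.
KEYWORDS = {
    "sports": {"match", "team", "goal"},
    "finance": {"stock", "market", "bank"},
    "health": {"doctor", "virus", "diet"},
    "travel": {"flight", "hotel", "beach"},
}
LABELS = list(KEYWORDS)


def toy_llm_classify(text: str, candidates: Sequence[str]) -> str:
    """Stand-in for an LLM call: pick the candidate with most keyword overlap."""
    tokens = set(text.lower().split())
    return max(candidates, key=lambda c: len(tokens & KEYWORDS[c]))


def fit_ranker(texts: Sequence[str], pseudo_labels: Sequence[str]) -> Dict[str, Counter]:
    """Learn per-label token counts from the unlabeled texts and their pseudo-labels."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for text, label in zip(texts, pseudo_labels):
        counts[label].update(text.lower().split())
    return counts


def rank_labels(counts: Dict[str, Counter], text: str, labels: Sequence[str]) -> List[str]:
    """Order labels from most to least relevant for one text."""
    tokens = text.lower().split()

    def score(label: str) -> float:
        total = sum(counts[label].values()) or 1
        return sum(counts[label][t] for t in tokens) / total

    return sorted(labels, key=score, reverse=True)


def label_space_reduction(
    texts: Sequence[str],
    labels: Sequence[str],
    classify: Callable[[str, Sequence[str]], str],
    top_k: int = 2,
    iterations: int = 2,
) -> List[str]:
    # Start from predictions over the full label space.
    predictions = [classify(t, labels) for t in texts]
    for _ in range(iterations):
        # Fit the ranker on current pseudo-labels, then re-classify each text
        # against only its top_k ranked candidates (one extra call per text).
        ranker = fit_ranker(texts, predictions)
        predictions = [
            classify(t, rank_labels(ranker, t, labels)[:top_k]) for t in texts
        ]
    return predictions


TEXTS = [
    "the team scored a late goal in the match",
    "the bank warned the stock market",
    "the doctor advised a new diet",
    "our flight and hotel near the beach",
]
preds = label_space_reduction(TEXTS, LABELS, toy_llm_classify)
```

Each pass through the loop costs one more classification call per text, which is the overhead the abstract says distillation is meant to reduce.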

Taxonomy

Topics: Machine Learning and Data Classification