The Silent Vote: Improving Zero-Shot LLM Reliability by Aggregating Semantic Neighborhoods

Sanket Badhe; Priyanka Tiwari; Deep Shah

arXiv:2605.09739·cs.CL·May 12, 2026

TL;DR

This paper introduces Semantic Softmax, a method to improve zero-shot LLM classification by aggregating semantic neighborhoods, reducing overconfidence and enhancing calibration and accuracy.

Contribution

The paper proposes Semantic Softmax, an inference-time layer for zero-shot classification that recovers the semantic information lost when constrained decoding renormalizes over the target labels only.

Findings

1. Semantic Softmax reduces Expected Calibration Error (ECE) and Brier Score.

2. It improves AUROC and Macro-F1 scores across datasets.

3. The method enhances model calibration and discriminative performance.
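
For readers unfamiliar with the calibration metrics named above, the following is a minimal sketch of the textbook definitions of the multiclass Brier score and of ECE with equal-width confidence bins. The function names, the choice of 10 bins, and the half-open binning rule are assumptions made for illustration; this summary does not state the exact variants used in the paper.

```python
def brier_score(probs, labels):
    """Mean squared distance between predicted probabilities and one-hot labels."""
    total = 0.0
    for p, y in zip(probs, labels):
        total += sum((pk - (1.0 if k == y else 0.0)) ** 2 for k, pk in enumerate(p))
    return total / len(probs)


def expected_calibration_error(probs, labels, n_bins=10):
    """Weighted average gap between accuracy and confidence per confidence bin.

    Bins are equal-width and half-open, [lo, hi); a confidence of exactly 1.0
    is placed in the last bin. n_bins=10 is an illustrative default.
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        conf = max(p)
        pred = p.index(conf)
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, 1.0 if pred == y else 0.0))

    n = len(probs)
    ece = 0.0
    for b in bins:
        if b:
            avg_conf = sum(c for c, _ in b) / len(b)
            acc = sum(a for _, a in b) / len(b)
            ece += (len(b) / n) * abs(acc - avg_conf)
    return ece


# Toy predictions: two examples, both predicted as class 0 with confidence 0.9,
# but only the first is actually class 0 (an overconfident classifier).
toy_probs = [[0.9, 0.1], [0.9, 0.1]]
toy_labels = [0, 1]
toy_brier = brier_score(toy_probs, toy_labels)
toy_ece = expected_calibration_error(toy_probs, toy_labels)
```

Lower is better for both metrics. In the toy case the model is right half the time while claiming 90% confidence, which is the kind of overconfidence the findings say Semantic Softmax reduces.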

Abstract

Large Language Models are increasingly used as zero-shot classifiers in complex reasoning tasks. However, standard constrained decoding suffers from a phenomenon we define as Renormalization Bias. When a model is restricted to a small set of target labels, the standard softmax operation discards the probability mass assigned to semantic synonyms in the original distribution. This loss of information, which we call the Silent Vote, results in artificial overconfidence and poor calibration. We propose Semantic Softmax, an inference-time layer that recovers this lost information by aggregating the scores of the semantic neighborhood surrounding each target label. We evaluate this approach on Qwen-3 and Phi-4-mini models using GoEmotions and Civil Comments datasets. Our results demonstrate consistent improvements across all evaluation metrics: Semantic Softmax substantially reduces…
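
The contrast the abstract draws can be illustrated with a small sketch. It compares a constrained softmax, which renormalizes over the label tokens only, with a neighborhood-aggregated variant that pools the scores of each label's synonyms before normalizing. The pooling rule (log-sum-exp over each neighborhood), the function names, the toy vocabulary, and the hand-picked neighborhoods are all assumptions for illustration; the paper's actual aggregation rule and neighborhood construction are not given in this summary.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def logsumexp(scores):
    """Numerically stable log of the summed exponentials."""
    m = max(scores)
    return m + math.log(sum(math.exp(s - m) for s in scores))


def constrained_softmax(logits, label_ids):
    """Baseline: keep only the label tokens' logits and renormalize."""
    return softmax([logits[i] for i in label_ids])


def neighborhood_softmax(logits, neighborhoods):
    """Assumed variant: pool each label's neighborhood (label token plus
    synonym tokens) with log-sum-exp, then normalize across labels."""
    pooled = [logsumexp([logits[i] for i in ids]) for ids in neighborhoods]
    return softmax(pooled)


# Hypothetical toy vocabulary and next-token logits (not from the paper).
vocab = ["joy", "happy", "glad", "anger", "mad", "the"]
logits = [2.0, 1.0, 1.0, 2.2, -1.0, 0.5]

label_ids = [0, 3]                    # target labels: "joy", "anger"
neighborhoods = [[0, 1, 2], [3, 4]]   # hand-picked synonym sets (assumption)

p_constrained = constrained_softmax(logits, label_ids)
p_semantic = neighborhood_softmax(logits, neighborhoods)
```

In this toy case the constrained softmax favors "anger" because its single token has the highest logit, while the pooled version favors "joy" once the mass on "happy" and "glad" is counted. That mass on synonyms is what the abstract calls the Silent Vote.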
