Do LLMs Signal When They're Right? Evidence from Neuron Agreement

Kang Chen; Yaoning Wang; Kai Xiong; Zhuoka Feng; Wenhe Sun; Haotian Chen; Yixin Cao

arXiv:2510.26277·cs.CL·October 31, 2025

TL;DR

This paper introduces Neuron Agreement Decoding (NAD), an unsupervised best-of-N method that selects among sampled LLM responses using internal neuron-activation signals rather than external labels, and that supports early stopping to cut token usage.

Contribution

The paper presents NAD, a new unsupervised decoding approach leveraging neuron activation patterns to improve correctness prediction and efficiency in LLM outputs.

Findings

01. NAD matches majority voting on verifiable benchmarks.

02. NAD outperforms Avg@64 on open-ended coding tasks.

03. NAD reduces token usage by 99% with minimal quality loss.

Abstract

Large language models (LLMs) commonly boost reasoning via sample-evaluate-ensemble decoders, achieving label-free gains without ground truth. However, prevailing strategies score candidates using only external outputs such as token probabilities, entropies, or self-evaluations, and these signals can be poorly calibrated after post-training. We instead analyze internal behavior based on neuron activations and uncover three findings: (1) external signals are low-dimensional projections of richer internal dynamics; (2) correct responses activate substantially fewer unique neurons than incorrect ones throughout generation; and (3) activations from correct responses exhibit stronger cross-sample agreement, whereas incorrect ones diverge. Motivated by these observations, we propose Neuron Agreement Decoding (NAD), an unsupervised best-of-N method that selects candidates using activation…
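The abstract is cut off before it describes NAD's scoring rule, so the following is only an illustrative sketch of the general idea behind finding (3), not the authors' algorithm. It assumes each sampled response has already been reduced to a set of activated neuron IDs, and it assumes Jaccard similarity as the agreement measure; the names `jaccard` and `select_by_agreement` are hypothetical. The candidate whose activation set agrees most with the other samples is selected.

```python
from itertools import combinations


def jaccard(a: frozenset, b: frozenset) -> float:
    """Jaccard similarity of two sets of activated neuron IDs (assumed metric)."""
    union = a | b
    # Two empty sets are treated as identical.
    return len(a & b) / len(union) if union else 1.0


def select_by_agreement(activated: list[frozenset]) -> int:
    """Return the index of the sample with the highest total pairwise agreement.

    `activated[i]` is the (hypothetical) set of neuron IDs that fired while
    generating sample i. Ties go to the lowest index.
    """
    n = len(activated)
    if n == 0:
        raise ValueError("need at least one sample")
    totals = [0.0] * n
    # Accumulate each pair's similarity onto both members of the pair.
    for i, j in combinations(range(n), 2):
        s = jaccard(activated[i], activated[j])
        totals[i] += s
        totals[j] += s
    return max(range(n), key=lambda i: totals[i])


# Toy example: three samples share most neurons, one diverges.
samples = [
    frozenset({1, 2, 3}),
    frozenset({1, 2, 3, 4}),
    frozenset({1, 2, 4}),
    frozenset({7, 8, 9, 10, 11}),
]
best = select_by_agreement(samples)
```

In this toy input the divergent fourth sample shares no neurons with the others and receives zero agreement, while the second sample overlaps most with its neighbors. How activations are actually extracted, thresholded, and compared in NAD is not stated in the visible text.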
