Between the Layers Lies the Truth: Uncertainty Estimation in LLMs Using Intra-Layer Local Information Scores

Zvi N. Badash; Yonatan Belinkov; Moti Freiman

arXiv:2603.22299·cs.LG·March 25, 2026

Open Access

TL;DR

This paper introduces a lightweight uncertainty estimation method for LLMs that scores cross-layer agreement patterns in internal representations using a single forward pass. It matches probing in-distribution and outperforms it under cross-dataset transfer and 4-bit weight-only quantization.

Contribution

The paper proposes a compact, per-instance agreement scoring method for uncertainty estimation that needs only a single forward pass, transfers across datasets better than probing, and remains robust under 4-bit weight-only quantization, evaluated across three models.

Findings

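01. Matches probing in-distribution, with mean diagonal differences of at most $-1.8$ AUPRC percentage points and $+4.9$ Brier score points.

02. Outperforms probing under cross-dataset transfer, with off-diagonal gains up to $+2.86$ AUPRC and $+21.02$ Brier points.

03. Remains effective under 4-bit weight-only quantization, improving over probing by $+1.94$ AUPRC points and $+5.33$ Brier points on average.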

Abstract

Large language models (LLMs) are often confidently wrong, making reliable uncertainty estimation (UE) essential. Output-based heuristics are cheap but brittle, while probing internal representations is effective yet high-dimensional and hard to transfer. We propose a compact, per-instance UE method that scores cross-layer agreement patterns in internal representations using a single forward pass. Across three models, our method matches probing in-distribution, with mean diagonal differences of at most $-1.8$ AUPRC percentage points and $+4.9$ Brier score points. Under cross-dataset transfer, it consistently outperforms probing, achieving off-diagonal gains up to $+2.86$ AUPRC and $+21.02$ Brier points. Under 4-bit weight-only quantization, it remains robust, improving over probing by $+1.94$ AUPRC points and $+5.33$ Brier points on average. Beyond performance, examining specific…
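For readers unfamiliar with the two metrics quoted above, the following is a minimal sketch of how AUPRC (computed here as step-wise average precision) and the Brier score are commonly defined, plus one plausible reading of the "diagonal" and "off-diagonal" terminology. It is not the authors' evaluation code. The function names, the convention that label 1 means "the model's answer was correct", and the layout of the results matrix (rows as training datasets, columns as test datasets) are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def brier_score(probs: Sequence[float], labels: Sequence[int]) -> float:
    """Mean squared error between predicted probabilities and 0/1 labels.

    Lower is better. Assumption: label 1 means the answer was correct.
    """
    if len(probs) != len(labels) or len(probs) == 0:
        raise ValueError("probs and labels must be non-empty and equally long")
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs)


def average_precision(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Step-wise area under the precision-recall curve (higher is better).

    Averages the precision measured at the rank of each positive example.
    Tied scores are kept in input order, which is a simplification.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("at least one positive label is required")
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at this recall step
    return total / n_pos


def diag_offdiag_means(matrix: List[List[float]]) -> Tuple[float, float]:
    """Mean of diagonal and off-diagonal cells of a square results matrix.

    Hypothetical layout: matrix[i][j] is the metric when fitting on
    dataset i and testing on dataset j, so the diagonal is in-distribution
    and the off-diagonal cells are cross-dataset transfer.
    """
    n = len(matrix)
    if n < 2 or any(len(row) != n for row in matrix):
        raise ValueError("expected a square matrix with at least 2 rows")
    diag = [matrix[i][i] for i in range(n)]
    off = [matrix[i][j] for i in range(n) for j in range(n) if i != j]
    return sum(diag) / len(diag), sum(off) / len(off)


if __name__ == "__main__":
    confidences = [0.9, 0.8, 0.3, 0.1]  # toy values, not from the paper
    correct = [1, 0, 1, 0]
    print("Brier:", brier_score(confidences, correct))
    print("AUPRC:", average_precision(confidences, correct))
    print("diag/off-diag:", diag_offdiag_means([[0.9, 0.6], [0.5, 0.8]]))
```

Under this reading, a "diagonal difference" would compare two methods on the in-distribution cells only, while an "off-diagonal gain" would compare them on the transfer cells. The excerpt does not state which class is treated as positive or how the differences are aggregated, so those details should be checked against the paper itself.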

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Topic Modeling