Beyond Confidence: The Rhythms of Reasoning in Generative Models

Deyuan Liu; Zecheng Wang; Zhanyue Qin; Zhiying Tu; Dianhui Chu; Dianbo Sui

arXiv:2602.10816 · cs.CL · February 12, 2026

Open Access

TL;DR

This paper introduces the Token Constraint Bound ($\delta_{\text{TCB}}$), a new metric that measures the internal stability of Large Language Models and reveals prediction sensitivities overlooked by traditional metrics like perplexity.

Contribution

We propose $\delta_{\text{TCB}}$, a novel stability metric linked to embedding geometry, to assess and improve LLM prediction robustness against input perturbations.

Findings

01

$\delta_{\text{TCB}}$ correlates with prompt engineering effectiveness.

02

It uncovers prediction instabilities missed by perplexity.

03

$\delta_{\text{TCB}}$ provides insights into LLM internal state resilience.

Abstract

Large Language Models (LLMs) exhibit impressive capabilities yet suffer from sensitivity to slight input context variations, hampering reliability. Conventional metrics like accuracy and perplexity fail to assess local prediction robustness, as normalized output probabilities can obscure the underlying resilience of an LLM's internal state to perturbations. We introduce the Token Constraint Bound ($\delta_{\text{TCB}}$), a novel metric that quantifies the maximum internal state perturbation an LLM can withstand before its dominant next-token prediction significantly changes. Intrinsically linked to output embedding space geometry, $\delta_{\text{TCB}}$ provides insights into the stability of the model's internal predictive commitment. Our experiments show $\delta_{\text{TCB}}$ correlates with effective prompt engineering and uncovers critical prediction instabilities missed by…
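
To illustrate why normalized output probabilities can hide fragility, the sketch below computes, for a plain linear output head, the smallest L2 perturbation of the hidden state that changes the argmax token. This is the standard distance-to-decision-boundary calculation for a linear classifier, not the paper's definition of $\delta_{\text{TCB}}$, which this page does not state. The function name `argmax_flip_radius`, the bias-free linear head, and the toy matrices are all assumptions made for the example.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D logit vector."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

def argmax_flip_radius(h, W):
    """Smallest L2 change to hidden state h that flips the argmax of W @ h.

    Assumes a bias-free linear output head with one row of W per token
    (hypothetical setup, at least two tokens). For the top token i and a
    rival j, the decision boundary is (w_i - w_j) . h = 0, so the distance
    from h to it is (z_i - z_j) / ||w_i - w_j||. The result is the minimum
    over all rivals j.
    """
    logits = W @ h
    top = int(np.argmax(logits))
    margins = logits[top] - np.delete(logits, top)
    gaps = np.linalg.norm(W[top] - np.delete(W, top, axis=0), axis=1)
    # Rivals with an identical embedding row can never overtake or be
    # separated by moving h, so they are treated as infinitely far away.
    with np.errstate(divide="ignore", invalid="ignore"):
        radii = np.where(gaps > 0, margins / gaps, np.inf)
    return float(radii.min())

# Toy setup A: unit-scale output embeddings.
W = np.array([[1.0, 0.0], [0.0, 1.0]])
h = np.array([2.0, 0.0])

# Toy setup B: embeddings scaled up by 10, hidden state scaled down by 10.
# The logits, and therefore the softmax probabilities, are unchanged.
W2 = 10.0 * W
h2 = h / 10.0

probs_a, probs_b = softmax(W @ h), softmax(W2 @ h2)
radius_a, radius_b = argmax_flip_radius(h, W), argmax_flip_radius(h2, W2)
```

Both setups produce identical logits and hence identical confidence, yet setup B tolerates a hidden-state perturbation ten times smaller than setup A, because the gap between its output embedding rows is ten times larger while the logit margin is the same. A probability-based metric such as perplexity cannot distinguish the two cases.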

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Computational and Text Analysis Methods