Representational Curvature Modulates Behavioral Uncertainty in Large Language Models

Jack King; Evelina Fedorenko; Eghbal A. Hosseini

arXiv:2604.23985·cs.AI·April 28, 2026

TL;DR

This paper links the geometric curvature of representational trajectories in large language models to their behavioral uncertainty, showing that curvature correlates with next-token entropy and that trajectory-aligned interventions on curvature modulate that entropy.

Contribution

The paper establishes a direct relationship between representational curvature and token-level uncertainty, and introduces trajectory regularization to reduce entropy without performance loss.

Findings

1. Curvature correlates with entropy across models and training stages.

2. Trajectory-aligned perturbations modulate entropy reliably.

3. Regularizing for straighter trajectories reduces token entropy.

Abstract

In autoregressive large language models (LLMs), temporal straightening offers an account of how the next-token prediction objective shapes representations. Models learn to progressively straighten the representational trajectory of input sequences across layers, potentially facilitating next-token prediction via linear extrapolation. However, a direct link between this trajectory and token-level behavior has been missing. We provide such a link by relating contextual curvature (a geometric measure of how sharply the representational trajectory bends over recent context) to next-token entropy. Across two models (GPT-2 XL and Pythia-2.8B), contextual curvature is correlated with entropy, and this relationship emerges during training. Perturbation experiments reveal selective dependence: manipulating curvature through trajectory-aligned interventions reliably modulates entropy, while…
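
To make the two quantities in the abstract concrete, here is a minimal sketch of how they might be computed. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the exact definition of contextual curvature, so the code assumes the common turning-angle formulation (the mean angle between successive difference vectors of a hidden-state trajectory). The function names `mean_curvature` and `next_token_entropy` are hypothetical, and the inputs are plain NumPy arrays rather than activations from a real model.

```python
import numpy as np


def mean_curvature(states):
    """Mean turning angle (radians) along a trajectory of hidden states.

    ASSUMPTION: curvature is the angle between consecutive step vectors;
    the paper's exact definition may differ.
    states: array of shape (T, d), one vector per token, with T >= 3.
    """
    states = np.asarray(states, dtype=float)
    if states.ndim != 2 or states.shape[0] < 3:
        raise ValueError("need a (T, d) array with at least 3 points")
    diffs = np.diff(states, axis=0)                        # (T-1, d) step vectors
    norms = np.linalg.norm(diffs, axis=1, keepdims=True)
    units = diffs / np.clip(norms, 1e-12, None)            # guard against zero steps
    cosines = np.sum(units[:-1] * units[1:], axis=1)       # (T-2,) cosine of each turn
    angles = np.arccos(np.clip(cosines, -1.0, 1.0))
    return float(angles.mean())


def next_token_entropy(logits):
    """Shannon entropy (nats) of the softmax distribution over a logit vector."""
    logits = np.asarray(logits, dtype=float)
    shifted = logits - logits.max()                        # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum())
    return float(-(np.exp(log_probs) * log_probs).sum())
```

Under this definition a perfectly straight trajectory has curvature 0, and a uniform distribution over the vocabulary has the maximum possible entropy. Correlating the two across tokens would be one way to probe the relationship the abstract describes.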
