Representational Curvature Modulates Behavioral Uncertainty in Large Language Models
Jack King, Evelina Fedorenko, Eghbal A. Hosseini

TL;DR
This paper links the geometric curvature of representational trajectories in large language models to their behavioral uncertainty: curvature tracks next-token entropy, and interventions that change curvature modulate that entropy.
Contribution
It establishes a direct relationship between representational curvature and token-level uncertainty, introducing trajectory regularization to reduce entropy without performance loss.
Findings
Contextual curvature correlates with next-token entropy in both GPT-2 XL and Pythia-2.8B, and the relationship emerges during training.
Trajectory-aligned perturbations modulate entropy reliably.
Regularizing for straighter trajectories reduces token entropy.
Abstract
In autoregressive large language models (LLMs), temporal straightening offers an account of how the next-token prediction objective shapes representations. Models learn to progressively straighten the representational trajectory of input sequences across layers, potentially facilitating next-token prediction via linear extrapolation. However, a direct link between this trajectory and token-level behavior has been missing. We provide such a link by relating contextual curvature, a geometric measure of how sharply the representational trajectory bends over recent context, to next-token entropy. Across two models (GPT-2 XL and Pythia-2.8B), contextual curvature is correlated with entropy, and this relationship emerges during training. Perturbation experiments reveal selective dependence: manipulating curvature through trajectory-aligned interventions reliably modulates entropy, while…
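The two quantities the abstract relates can be illustrated with a minimal sketch. It assumes curvature is measured as the mean angle between consecutive displacement vectors of the hidden-state trajectory, a common choice in the temporal-straightening literature; the paper's exact definition of contextual curvature (window size, layer, normalization) is not given in this summary. The function names `mean_curvature` and `next_token_entropy` are hypothetical and chosen only for illustration.

```python
import numpy as np


def mean_curvature(states):
    """Mean turning angle (radians) along a trajectory of shape (T, d).

    Assumption: curvature at step t is the angle between the displacement
    vectors x[t+1] - x[t] and x[t+2] - x[t+1]. This may differ from the
    paper's definition of contextual curvature.
    """
    states = np.asarray(states, dtype=float)
    if states.ndim != 2 or states.shape[0] < 3:
        raise ValueError("need an array of shape (T, d) with T >= 3")
    diffs = np.diff(states, axis=0)                        # (T-1, d) displacements
    norms = np.linalg.norm(diffs, axis=1, keepdims=True)
    units = diffs / np.clip(norms, 1e-12, None)            # unit displacement vectors
    cosines = np.sum(units[:-1] * units[1:], axis=1)       # cosine of each turning angle
    angles = np.arccos(np.clip(cosines, -1.0, 1.0))
    return float(angles.mean())


def next_token_entropy(logits):
    """Shannon entropy (nats) of the softmax distribution over a 1-D logit vector."""
    logits = np.asarray(logits, dtype=float)
    shifted = logits - logits.max()                        # subtract max for numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum())
    probs = np.exp(log_probs)
    return float(-(probs * log_probs).sum())
```

Under these assumptions, a perfectly straight trajectory has curvature 0 and a uniform distribution over V tokens has entropy log V. Given per-position curvature and entropy values collected from a model, their association could then be checked with, for example, `np.corrcoef`.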
