A geometric relation of the error introduced by sampling a language model's output distribution to its internal state
Albert F. Modenbach

TL;DR
This paper explores the geometric properties of language model output distributions, revealing how token embedding geometry relates to model sensitivity and internal representations, especially in chess reasoning tasks.
Contribution
It introduces a geometric framework based on an $\mathfrak{so}(n)$-valued 1-form that links token embedding geometry to model sensitivity and semantic understanding.
Findings
The curvature of the geometric form correlates with the model's semantic reasoning in chess.
Token space geometry reflects internal problem representations.
The geometric approach uncovers meaningful relationships between embeddings and model behavior.
Abstract
GPT-style language models are sensitive to single-token changes at generation points where the predicted probability distribution is spread across multiple tokens. Viewing this sensitivity as a geometric property, we derive an $\mathfrak{so}(n)$-valued 1-form that depends only on the geometry of the token embeddings. Despite this purely geometric origin, we show that its curvature is semantically meaningful: On chess reasoning tasks, the curvature couples to the world model of an off-the-shelf instruction-tuned model, with transformations clustering by board region and respecting piece importance. Our findings suggest that token space geometry directly reflects how models internally represent problems.
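As a rough illustration of what "spread across multiple tokens" can mean in practice, the sketch below flags a generation point by the Shannon entropy of its next-token distribution. The entropy criterion, the threshold value, and the names `softmax`, `entropy`, and `is_spread` are assumptions made here for illustration; the abstract does not state how the paper quantifies spread, and this is not the paper's 1-form construction.

```python
import math

def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy in nats; zero-probability entries contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def is_spread(logits, threshold=0.5):
    """Hypothetical criterion: a position counts as 'spread' when the
    entropy of its next-token distribution exceeds `threshold` nats.
    The threshold is an arbitrary illustrative value."""
    return entropy(softmax(logits)) > threshold

# Toy logits over a 4-token vocabulary (made-up numbers).
peaked = [10.0, 0.0, 0.0, 0.0]   # one token dominates
spread = [2.0, 1.9, 1.8, -5.0]   # three tokens compete

print(is_spread(peaked), is_spread(spread))
```

Under this assumed criterion, only the second position would be treated as a point where swapping the sampled token could plausibly change the continuation.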
