Discrete Semantic States and Hamiltonian Dynamics in LLM Embedding Spaces
Timo Aukusti Laine

TL;DR
This paper applies linear algebra and a quantum-inspired Hamiltonian formalism to LLM embedding spaces, arguing that embeddings occupy discrete semantic states and that this structure can improve understanding of model behavior.
Contribution
It introduces a Hamiltonian formalism for analyzing LLM embeddings, linking semantic states to quantum-inspired concepts such as an analogue of zero-point energy.
Findings
LLM embeddings exhibit discrete semantic states.
Hamiltonian formalism relates cosine similarity to semantic perturbations.
Quantum-inspired analysis suggests new avenues for understanding and mitigating hallucinations.
Abstract
We investigate the structure of Large Language Model (LLM) embedding spaces using mathematical concepts, particularly linear algebra and the Hamiltonian formalism, drawing inspiration from analogies with quantum mechanical systems. Motivated by the observation that LLM embeddings exhibit distinct states, suggesting discrete semantic representations, we explore the application of these mathematical tools to analyze semantic relationships. We demonstrate that the L2 normalization constraint, a characteristic of many LLM architectures, results in a structured embedding space suitable for analysis using a Hamiltonian formalism. We derive relationships between cosine similarity and perturbations of embedding vectors, and explore direct and indirect semantic transitions. Furthermore, we explore a quantum-inspired perspective, deriving an analogue of zero-point energy and discussing potential…
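The link between L2 normalization and cosine similarity mentioned in the abstract can be illustrated with a standard identity: for unit vectors a and b, cos(a, b) = 1 - ||a - b||^2 / 2, so a small perturbation of an embedding translates directly into a drop in cosine similarity. The sketch below checks this numerically on random vectors. It is a generic illustration and not the paper's own derivation; the helper names (`l2_normalize`, `cosine_similarity`, `perturb`), the dimension, and the perturbation scale are assumptions chosen for the example.

```python
import math
import random


def l2_normalize(v):
    """Scale v to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize the zero vector")
    return [x / norm for x in v]


def cosine_similarity(a, b):
    """Cosine of the angle between a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def perturb(v, delta):
    """Add delta to v, then project back onto the unit sphere."""
    return l2_normalize([x + d for x, d in zip(v, delta)])


# Hypothetical toy setup: an 8-dimensional "embedding" and a small random shift.
rng = random.Random(0)
dim = 8
a = l2_normalize([rng.gauss(0.0, 1.0) for _ in range(dim)])
delta = [0.1 * rng.gauss(0.0, 1.0) for _ in range(dim)]
b = perturb(a, delta)

cos_ab = cosine_similarity(a, b)
squared_distance = sum((x - y) ** 2 for x, y in zip(a, b))
identity_rhs = 1.0 - 0.5 * squared_distance  # holds only for unit vectors

print(f"cosine similarity:     {cos_ab:.12f}")
print(f"1 - ||a - b||^2 / 2:   {identity_rhs:.12f}")
```

The two printed values agree up to floating-point error, which is why, on a normalized embedding space, cosine similarity and Euclidean displacement carry the same information.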
Taxonomy
Topics: Machine Learning in Materials Science · Topic Modeling · Quantum many-body systems
