Directed Metric Structures arising in Large Language Models

Stéphane Gaubert; Yiannis Vlassopoulos

arXiv:2405.12264·cs.LG·May 22, 2024·2 cites

Open Access

TL;DR

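This paper shows that the conditional probability distributions produced by large language models define a directed metric structure on the space of texts via negative log probabilities, and describes that structure in a tropical (min-plus) geometric framework that encodes how texts extend one another.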

Contribution

It introduces a metric polyhedron framework for analyzing text in language models, connecting probability, geometry, and category theory without relying explicitly on category theory.

Findings

01

Text extensions form isometric polyhedra

02

Text vectors can be approximated as Boltzmann weighted combinations

03

The metric structure relates to the Isbell completion and lattice closure

Abstract

Large Language Models are transformer neural networks trained to produce a probability distribution over the possible next words for given texts in a corpus, in such a way that the most likely predicted word is the actual word in the training text. In this paper we identify the mathematical structure defined by such conditional probability distributions of text extensions. Changing the viewpoint from probabilities to $-\log$ probabilities, we observe that the subtext order is completely encoded in a metric structure defined on the space of texts $L$ by $-\log$ probabilities. We then construct a metric polyhedron $P(L)$ and an isometric embedding (called the Yoneda embedding) of $L$ into $P(L)$ such that texts map to generators of certain special extremal rays. We explain that $P(L)$ is a $(\min, +)$ (tropical) linear span of these…
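To make the directed metric described in the abstract concrete, the following is a minimal sketch and not the authors' code. It assumes a hypothetical toy next-token table `NEXT`, treats "y extends x" as "x is a prefix of y", and uses the illustrative name `directed_distance` for the $-\log$ probability of an extension. Under these assumptions the distance is infinite when y does not extend x, and the chain rule of conditional probability makes distances add along a chain of extensions.

```python
import math

# Hypothetical toy model (an assumption for illustration): next-token
# probabilities keyed by the text generated so far.
NEXT = {
    "": {"a": 0.5, "b": 0.5},
    "a": {"a": 0.25, "b": 0.75},
    "b": {"a": 1.0},
}


def directed_distance(x: str, y: str) -> float:
    """Return -log P(y | x) if y extends x, otherwise infinity.

    The distance is directed: d(x, y) and d(y, x) generally differ,
    and at most one of them is finite unless x == y.
    """
    if not y.startswith(x):
        return math.inf  # y is not an extension of x
    total = 0.0
    prefix = x
    for token in y[len(x):]:
        p = NEXT.get(prefix, {}).get(token, 0.0)
        if p == 0.0:
            return math.inf  # extension has zero probability in the toy model
        total += -math.log(p)  # chain rule: log probabilities add
        prefix += token
    return total


if __name__ == "__main__":
    print(directed_distance("", "ab"))   # finite: "ab" extends ""
    print(directed_distance("ab", "a"))  # inf: "a" does not extend "ab"
```

In this sketch a finite `directed_distance(x, y)` holds exactly when y is a positive-probability extension of x, which is one way to read the statement that the subtext order is encoded in the metric.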

Taxonomy

Topics: Opinion Dynamics and Social Influence