Language Models Learn Universal Representations of Numbers and Here's Why You Should Care
Michal Štefánik, Timothee Mickus, Marek Kadlčík, Bertram Højer, Michal Spiegel, Raúl Vázquez, Aman Sinha, Josef Kuchař, Philipp Mondorf, Pontus Stenetorp

TL;DR
This paper demonstrates that large language models develop nearly universal sinusoidal representations of numbers. Accounting for this structure is crucial when assessing how accurately LLMs encode numeric information, and enhancing it can reduce arithmetic errors.
Contribution
It quantifies the universality of sinusoidal number representations across LLMs and shows how enhancing this property improves numerical reasoning.
Findings
LLMs develop nearly identical sinusoidal number representations
Number representations are interchangeable across different LLMs
Enhancing sinusoidality reduces LLMs' arithmetic errors
Abstract
Prior work has shown that large language models (LLMs) often converge to accurate input embeddings for numbers, based on sinusoidal representations. In this work, we quantify that these representations are in fact strikingly systematic, to the point of being almost perfectly universal: different LLM families develop equivalent sinusoidal structures, and number representations are broadly interchangeable in a large swathe of experimental setups. We show that properly factoring in this characteristic is crucial when it comes to assessing how accurately LLMs encode numeric and other ordinal information, and that mechanistically enhancing this sinusoidality can also lead to reductions of LLMs' arithmetic errors.
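To make "sinusoidal" and "interchangeable" concrete, here is a minimal toy sketch. It is not the paper's method or data: the periods, embedding width, helper names, and the spectral-concentration score are all assumptions chosen for illustration. Two hypothetical "models" embed the integers as random linear mixtures of the same sine and cosine features. The sketch then checks (a) that the energy of each embedding dimension, taken along the number axis, concentrates in a few frequencies, and (b) that a single linear map translates one model's number embeddings into the other's.

```python
import numpy as np

def toy_sinusoidal_embeddings(n_numbers=200, periods=(5, 10, 20, 100), dim=16, seed=0):
    """Hypothetical number embeddings: sin/cos features of each integer,
    mixed into `dim` dimensions by a random linear map (one map per seed)."""
    rng = np.random.default_rng(seed)
    n = np.arange(n_numbers)[:, None]
    angles = 2 * np.pi * n / np.asarray(periods, dtype=float)[None, :]
    feats = np.concatenate([np.sin(angles), np.cos(angles)], axis=1)
    mix = rng.normal(size=(feats.shape[1], dim))
    return feats @ mix

def spectral_concentration(emb, top_k=8):
    """Illustrative score: fraction of each dimension's spectral energy
    (along the number axis) held by its top_k frequency bins, averaged over dimensions."""
    centered = emb - emb.mean(axis=0, keepdims=True)
    power = np.abs(np.fft.rfft(centered, axis=0)) ** 2
    top = np.sort(power, axis=0)[-top_k:].sum(axis=0)
    return float(np.mean(top / power.sum(axis=0)))

def linear_alignment_r2(emb_a, emb_b):
    """Fit emb_a @ W ~= emb_b by least squares and return the R^2 of the fit."""
    w, *_ = np.linalg.lstsq(emb_a, emb_b, rcond=None)
    residual = emb_b - emb_a @ w
    total = emb_b - emb_b.mean(axis=0, keepdims=True)
    return float(1.0 - (residual ** 2).sum() / (total ** 2).sum())

# Two toy "models" sharing sinusoidal structure but with different mixing maps.
emb_a = toy_sinusoidal_embeddings(seed=0)
emb_b = toy_sinusoidal_embeddings(seed=1)
# Baseline with no structure along the number axis.
emb_rand = np.random.default_rng(2).normal(size=emb_a.shape)

score_sin = spectral_concentration(emb_a)
score_rand = spectral_concentration(emb_rand)
r2 = linear_alignment_r2(emb_a, emb_b)
print(f"concentration (sinusoidal): {score_sin:.3f}")
print(f"concentration (random):     {score_rand:.3f}")
print(f"linear alignment R^2:       {r2:.3f}")
```

In this toy setting the sinusoidal embeddings score close to 1 on concentration while the random baseline scores much lower, and the linear map recovers the second model's embeddings almost exactly because both span the same sinusoidal subspace. Real LLM embeddings would need to be extracted from the models' input embedding matrices, which this sketch does not attempt.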
