LLMs Know More About Numbers than They Can Say

Fengting Yuchi; Li Du; Jason Eisner

arXiv:2602.07812·cs.CL·February 18, 2026


TL;DR

This paper probes the hidden states of language models to test whether they understand numerical magnitudes. The models encode number sizes well internally but struggle when explicitly asked to rank numerals, and finetuning with a magnitude-based auxiliary loss improves their explicit ranking accuracy.

Contribution

The paper demonstrates that LLMs encode numerical magnitudes effectively in their hidden states, and it introduces a finetuning method with an auxiliary loss that improves their explicit numerical ranking.

Findings

01

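A single linear projection of the hidden states encodes log-magnitudes, recovering numbers with about 2.3% relative error on restricted synthetic text (19.06% on scientific papers).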

02

A linear classifier on the hidden state after reading a pair of numerals recovers their ranking with over 90% accuracy.

03

Finetuning with a magnitude-based auxiliary loss improves ranking accuracy by 3.22%.
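The first finding can be illustrated with a toy linear probe. The sketch below is not the authors' code. It synthesizes low-dimensional "hidden states" in which the base-10 log of a number lies along one direction plus Gaussian noise, then fits a single linear projection by least squares. The names `make_example`, `fit_linear_probe`, `predict`, and `mean_relative_error`, the 4-dimensional states, the encoding direction, and the noise level are all assumptions chosen for illustration; a real probe would be fit on activations extracted from an LLM layer, typically with a numerical library rather than hand-written elimination.

```python
import random

# Hypothetical direction along which log10(number) is encoded (assumption).
DIRECTION = [0.5, -1.0, 0.25, 2.0]
NOISE_STD = 0.05  # assumed noise level of the synthetic states


def make_example(rng):
    """Return (synthetic hidden state, log10 of a random number in [1, 1e6])."""
    log_mag = rng.uniform(0.0, 6.0)
    state = [a * log_mag + rng.gauss(0.0, NOISE_STD) for a in DIRECTION]
    return state, log_mag


def fit_linear_probe(states, targets):
    """Least-squares fit of target ~ w . state + b via the normal equations."""
    rows = [s + [1.0] for s in states]  # append a constant bias feature
    d = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    m = [[sum(r[i] * r[j] for r in rows) for j in range(d)]
         + [sum(r[i] * t for r, t in zip(rows, targets))]
         for i in range(d)]
    # Gaussian elimination with partial pivoting.
    for col in range(d):
        pivot = max(range(col, d), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, d):
            factor = m[r][col] / m[col][col]
            for k in range(col, d + 1):
                m[r][k] -= factor * m[col][k]
    # Back substitution.
    w = [0.0] * d
    for i in reversed(range(d)):
        rest = sum(m[i][k] * w[k] for k in range(i + 1, d))
        w[i] = (m[i][d] - rest) / m[i][i]
    return w  # the last entry is the bias


def predict(weights, state):
    """Apply the probe: a single linear projection plus bias."""
    return sum(w * h for w, h in zip(weights, state)) + weights[-1]


def mean_relative_error(weights, examples):
    """Mean |10**pred - 10**true| / 10**true over the examples."""
    errs = [abs(10 ** predict(weights, s) - 10 ** t) / 10 ** t
            for s, t in examples]
    return sum(errs) / len(errs)


rng = random.Random(0)
train = [make_example(rng) for _ in range(200)]
test = [make_example(rng) for _ in range(100)]
probe = fit_linear_probe([s for s, _ in train], [t for _, t in train])
print(f"held-out relative error: {mean_relative_error(probe, test):.3f}")
```

The probe predicts a log-magnitude and the error is measured after exponentiating back to the original scale, which is one plausible reading of "relative error" here. The number this toy setup prints reflects only the assumed noise level and says nothing about the paper's reported figures.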

Abstract

Although state-of-the-art LLMs can solve math problems, we find that they make errors on numerical comparisons with mixed notation: "Which is larger, $5.7 \times 10^{2}$ or $580$?" This raises a fundamental question: Do LLMs even know how big these numbers are? We probe the hidden states of several smaller open-source LLMs. A single linear projection of an appropriate hidden layer encodes the log-magnitudes of both kinds of numerals, allowing us to recover the numbers with relative error of about 2.3% (on restricted synthetic text) or 19.06% (on scientific papers). Furthermore, the hidden state after reading a pair of numerals encodes their ranking, with a linear classifier achieving over 90% accuracy. Yet surprisingly, when explicitly asked to rank the same pairs of numerals, these LLMs achieve only 50-70% accuracy, with worse performance for models whose probes are less effective.…
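The mixed-notation comparison quoted in the abstract can be worked through directly. The following is a minimal sketch that assumes numerals are written either as plain decimals or in the textual form `5.7 x 10^2`; the helper names `parse_numeral` and `larger` and the accepted string format are illustrative choices, not anything taken from the paper.

```python
import math
import re

# Matches numerals such as "5.7 x 10^2" or "5.7 × 10^-3" (format is an assumption).
_SCI = re.compile(r"\s*([0-9]*\.?[0-9]+)\s*[x×]\s*10\^(-?[0-9]+)\s*")


def parse_numeral(text):
    """Parse a plain decimal ('580') or a scientific-notation numeral."""
    match = _SCI.fullmatch(text)
    if match:
        mantissa, exponent = float(match.group(1)), int(match.group(2))
        return mantissa * 10.0 ** exponent
    return float(text)


def larger(a, b):
    """Return whichever numeral string denotes the larger value."""
    return a if parse_numeral(a) > parse_numeral(b) else b


a, b = "5.7 x 10^2", "580"
print(larger(a, b))  # prints 580, since 5.7 x 10^2 = 570
# Gap between the two numerals on the log10 scale.
print(math.log10(parse_numeral(b)) - math.log10(parse_numeral(a)))
```

The two values differ by about 1.7% (570 versus 580), so the pair sits close together on the log scale that the probes operate on.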

Code & Models

Datasets

VCY019/Numeracy-Probing

Taxonomy

Topics: Mathematics, Computing, and Information Processing · Topic Modeling · History and Theory of Mathematics