Value-Aware Numerical Representations for Transformer Language Models

Andreea Dutulescu; Stefan Ruseti; Mihai Dascalu

arXiv:2601.09706·cs.CL·January 15, 2026

Open Access

TL;DR

This paper proposes a value-aware numerical representation for Transformer language models that explicitly encodes numerical magnitude, improving their arithmetic reasoning and numerical robustness across various tasks and formats.

Contribution

The paper introduces a dedicated prefix token whose embedding is conditioned on the underlying numerical value, giving the model direct access to magnitude information when it reads and manipulates numbers.

Findings

01. Outperforms baselines on arithmetic tasks
02. Improves numerical robustness across formats
03. Compatible with existing architectures

Abstract

Transformer-based language models often achieve strong results on mathematical reasoning benchmarks while remaining fragile on basic numerical understanding and arithmetic operations. A central limitation is that numbers are processed as symbolic tokens whose embeddings do not explicitly encode numerical value, leading to systematic errors. We introduce a value-aware numerical representation that augments standard tokenized inputs with a dedicated prefix token whose embedding is explicitly conditioned on the underlying numerical value. This mechanism injects magnitude information directly into the model's input space while remaining compatible with existing tokenizers and decoder-only Transformer architectures. Evaluation on arithmetic tasks shows that the proposed approach outperforms baselines across numerical formats, tasks, and operand lengths. These results indicate that explicitly…
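
The prefix-token idea described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the `<num>` token name, the simplistic number regex (which treats any leading `-` as a sign), and the hand-built sign, log-magnitude, and sinusoid features. The authors' actual value encoder is not described in this summary and is likely a learned component.

```python
import math
import re

NUM_TOKEN = "<num>"  # hypothetical prefix token name, not from the paper
NUMBER_RE = re.compile(r"-?\d+(?:\.\d+)?")  # simplistic: integers and decimals only


def add_value_prefixes(text):
    """Insert NUM_TOKEN before every number; return the new text and parsed values."""
    values = [float(m.group()) for m in NUMBER_RE.finditer(text)]
    marked = NUMBER_RE.sub(lambda m: f"{NUM_TOKEN} {m.group()}", text)
    return marked, values


def value_embedding(value, dim=8):
    """Toy value-conditioned vector: sign, log-magnitude, then sinusoids of it.

    Assumption: a real model would probably pass such features through a
    learned projection into the model's embedding space.
    """
    sign = 0.0 if value == 0 else math.copysign(1.0, value)
    log_mag = math.log1p(abs(value))
    features = [sign, log_mag]
    k = 0
    while len(features) < dim:
        freq = 1.0 / (10.0 ** k)
        features.append(math.sin(log_mag * freq))
        if len(features) < dim:
            features.append(math.cos(log_mag * freq))
        k += 1
    return features


marked, values = add_value_prefixes("Add 12 and 3.5")
prefix_vectors = [value_embedding(v) for v in values]
```

In this sketch, `marked` keeps the original digit tokens so an existing tokenizer still applies, and each vector in `prefix_vectors` would stand in for the embedding at the matching `<num>` position instead of a fixed lookup-table entry.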

Taxonomy

Topics: Mathematics, Computing, and Information Processing · Model Reduction and Neural Networks · Topic Modeling