Value-Aware Numerical Representations for Transformer Language Models
Andreea Dutulescu, Stefan Ruseti, Mihai Dascalu

TL;DR
This paper proposes a value-aware numerical representation for Transformer language models that explicitly encodes numerical magnitude, improving their arithmetic reasoning and numerical robustness across various tasks and formats.
Contribution
The paper introduces a dedicated prefix token whose embedding is conditioned on the underlying numerical value, injecting magnitude information directly into the model's input space.
Findings
Outperforms baselines on arithmetic tasks
Improves numerical robustness across formats
Compatible with existing architectures
Abstract
Transformer-based language models often achieve strong results on mathematical reasoning benchmarks while remaining fragile on basic numerical understanding and arithmetic operations. A central limitation is that numbers are processed as symbolic tokens whose embeddings do not explicitly encode numerical value, leading to systematic errors. We introduce a value-aware numerical representation that augments standard tokenized inputs with a dedicated prefix token whose embedding is explicitly conditioned on the underlying numerical value. This mechanism injects magnitude information directly into the model's input space while remaining compatible with existing tokenizers and decoder-only Transformer architectures. Evaluation on arithmetic tasks shows that the proposed approach outperforms baselines across numerical formats, tasks, and operand lengths. These results indicate that explicitly…
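The prefix-token idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the `<num>` marker, the `value_features` function, the log-magnitude and sinusoidal features, and the feature size are all hypothetical choices made for illustration, and the sketch handles only non-negative literals. In a real model, the feature vector would presumably pass through a learned projection to the embedding dimension before being used as the prefix token's embedding.

```python
import math
import re

# Hypothetical marker token placed before each number (not taken from the paper).
PREFIX = "<num>"
# Matches non-negative integer or decimal literals only (a simplifying assumption).
NUM_PATTERN = re.compile(r"\d+(?:\.\d+)?")


def value_features(x, dim=8):
    """Map a non-negative number to a fixed-size vector that depends on its magnitude.

    The first feature is log10(1 + x); the rest are sines and cosines of that
    log-magnitude at doubling frequencies. This is an assumed featurization.
    """
    log_mag = math.log10(1.0 + abs(x))
    feats = [log_mag]
    k = 0
    while len(feats) < dim:
        freq = 2.0 ** k
        feats.append(math.sin(freq * log_mag))
        if len(feats) < dim:
            feats.append(math.cos(freq * log_mag))
        k += 1
    return feats


def add_value_prefixes(text):
    """Insert PREFIX before every number and return (augmented_text, values)."""
    values = [float(m.group()) for m in NUM_PATTERN.finditer(text)]
    augmented = NUM_PATTERN.sub(lambda m: f"{PREFIX} {m.group()}", text)
    return augmented, values


augmented, values = add_value_prefixes("12 + 305 = 317")
# One value-conditioned vector per inserted prefix token.
prefix_embeddings = [value_features(v) for v in values]
print(augmented)
```

The original digit tokens are left untouched, so an existing tokenizer still sees the usual text; only the extra prefix position carries the value-dependent vector.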
Taxonomy
Topics: Mathematics, Computing, and Information Processing · Model Reduction and Neural Networks · Topic Modeling
