How Numerical Precision Affects Arithmetical Reasoning Capabilities of LLMs
Guhao Feng, Kai Yang, Yuntian Gu, Xinyue Ai, Shengjie Luo, Jiacheng Sun, Di He, Zhenguo Li, Liwei Wang

TL;DR
This paper investigates how the numerical precision used in Transformer-based large language models affects their ability to perform arithmetic tasks, combining theoretical analysis with empirical experiments and identifying precision as a key factor in arithmetic performance.
Contribution
It provides a theoretical framework linking numerical precision to arithmetic performance in LLMs and shows that standard precision lets Transformers handle tasks such as iterated addition and integer multiplication with significantly smaller model sizes.
Findings
Transformers operating with low numerical precision fail on arithmetic tasks such as iterated addition and integer multiplication unless model size grows super-polynomially with the input length.
Transformers with standard numerical precision handle the same tasks efficiently with significantly smaller models.
Empirical results support the theoretical predictions about how numerical precision affects LLMs' arithmetic reasoning.
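To give a feel for why precision matters in iterated addition, here is a minimal sketch using ordinary floating-point arithmetic. It is not the paper's construction or its experimental setup: it only rounds a running total to IEEE 754 half precision (binary16) after every addition. The helper names `to_half` and `iterated_sum_half` are hypothetical and chosen for this illustration.

```python
import struct


def to_half(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision (binary16) value."""
    # The struct format code "e" packs a float as binary16.
    return struct.unpack("e", struct.pack("e", x))[0]


def iterated_sum_half(values):
    """Add values left to right, rounding the running total to binary16 each step."""
    total = 0.0
    for v in values:
        # Illustrative assumption: both the operand and the result are kept in binary16.
        total = to_half(total + to_half(v))
    return total


if __name__ == "__main__":
    ones = [1] * 4096
    print("exact sum:         ", sum(ones))
    print("half-precision sum:", iterated_sum_half(ones))
```

With 4096 ones as input, the half-precision running total stops growing once it reaches 2048, because binary16 cannot represent 2049 and the addition rounds back to 2048. The exact integer sum is unaffected. This toy effect is only an analogy for the precision limits the paper analyzes in Transformers.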
Abstract
Despite the remarkable success of Transformer-based large language models (LLMs) across various domains, understanding and enhancing their mathematical capabilities remains a significant challenge. In this paper, we conduct a rigorous theoretical analysis of LLMs' mathematical abilities, with a specific focus on their arithmetic performances. We identify numerical precision as a key factor that influences their effectiveness in arithmetical tasks. Our results show that Transformers operating with low numerical precision fail to address arithmetic tasks, such as iterated addition and integer multiplication, unless the model size grows super-polynomially with respect to the input length. In contrast, Transformers with standard numerical precision can efficiently handle these tasks with significantly smaller model sizes. We further support our theoretical findings through empirical…
Taxonomy
Topics: Mathematics, Computing, and Information Processing
Methods: Focus
