Unraveling Arithmetic in Large Language Models: The Role of Algebraic Structures
Fu-Chieh Chang, You-Chen Lin, Pei-Yuan Wu

TL;DR
This paper investigates the hypothesis that large language models perform arithmetic by learning algebraic structures, and presents empirical and theoretical evidence that such structures help models generalize to unseen arithmetic inputs.
Contribution
The paper proposes that LLMs learn algebraic structures such as commutativity and identity, and supports the claim with empirical results on datasets and a theoretical analysis of transformer embeddings.
Findings
LLMs can learn algebraic properties from data.
Transformer embeddings can be invariant to permutations of the inputs and to the presence of identity elements.
Leveraging algebraic structures improves LLMs' arithmetic reasoning.
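As a toy illustration of the permutation-invariance finding (not the paper's own analysis), the sketch below runs a single self-attention layer with no positional encoding and mean-pools the result. Every choice here is an assumption made for illustration: the identity query/key/value projections, the random 4-dimensional embeddings, the mean-pooling readout, and the hypothetical helper name `attention_pool`. In this simplified setting, swapping the two operand tokens leaves the pooled vector unchanged up to floating-point rounding.

```python
import math
import random


def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attention_pool(tokens, embed):
    """Self-attention with identity Q/K/V projections and no positional
    encoding, followed by mean pooling over positions (toy assumption)."""
    xs = [embed[t] for t in tokens]
    dim = len(xs[0])
    outputs = []
    for q in xs:
        weights = softmax([dot(q, k) / math.sqrt(dim) for k in xs])
        outputs.append(
            [sum(w * v[i] for w, v in zip(weights, xs)) for i in range(dim)]
        )
    n = len(outputs)
    return [sum(o[i] for o in outputs) / n for i in range(dim)]


# Hypothetical random embeddings for the tokens 0..9.
rng = random.Random(0)
embed = {tok: [rng.uniform(-1, 1) for _ in range(4)] for tok in range(10)}

pooled_ab = attention_pool([3, 7], embed)
pooled_ba = attention_pool([7, 3], embed)

# Largest coordinate-wise gap between the two orderings.
max_diff = max(abs(x - y) for x, y in zip(pooled_ab, pooled_ba))
print(max_diff)
```

Real transformers add positional information and usually read out from a specific position, so this invariance does not hold automatically there; the sketch only shows the kind of property the finding refers to.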
Abstract
Large language models (LLMs) have demonstrated remarkable mathematical capabilities, largely driven by chain-of-thought (CoT) prompting, which decomposes complex reasoning into step-by-step solutions. This approach has enabled significant advancements, as evidenced by performance on benchmarks like GSM8K and MATH. However, the mechanisms underlying LLMs' ability to perform arithmetic in a single step of CoT remain poorly understood. Existing studies debate whether LLMs encode numerical values or rely on symbolic reasoning, while others explore attention and multi-layered processing in arithmetic tasks. In this work, we propose that LLMs learn arithmetic by capturing algebraic structures, such as commutativity and identity properties. Since these structures are observable through input-output relationships, they can generalize to unseen data. We empirically demonstrate that LLMs can…
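The claim that algebraic structures are observable through input-output relationships can be made concrete with a small sketch. It treats a binary operation as a black box over a finite set, checks commutativity and searches for an identity element using outputs alone, and then shows how commutativity lets seen pairs answer unseen ones. The modulus 7, the `a <= b` train/test split, and all function names are assumptions chosen for illustration; they are not the paper's dataset or method.

```python
from itertools import product


def is_commutative(op, elements):
    # True if op(a, b) == op(b, a) for every pair of elements.
    return all(op(a, b) == op(b, a) for a, b in product(elements, repeat=2))


def find_identity(op, elements):
    # Return an element e with op(e, a) == op(a, e) == a for all a, or None.
    for e in elements:
        if all(op(e, a) == a and op(a, e) == a for a in elements):
            return e
    return None


n = 7  # assumed modulus for the toy structure
elements = list(range(n))


def add_mod(a, b):
    return (a + b) % n


def sub_mod(a, b):
    return (a - b) % n


# Assumed split: pairs with a <= b are seen, pairs with a > b are held out.
pairs = list(product(elements, repeat=2))
train = {(a, b): add_mod(a, b) for a, b in pairs if a <= b}
test = {(a, b): add_mod(a, b) for a, b in pairs if a > b}


def predict_with_commutativity(a, b):
    # Answer an unseen (a, b) by looking up the seen, reordered pair.
    return train[(min(a, b), max(a, b))]


correct = sum(predict_with_commutativity(a, b) == y for (a, b), y in test.items())
accuracy = correct / len(test)
print(is_commutative(add_mod, elements), find_identity(add_mod, elements), accuracy)
```

Subtraction modulo 7 fails both checks here, which is why the same lookup trick would not carry over to it.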
Taxonomy
Topics: Natural Language Processing Techniques
Methods: Softmax · Attention Is All You Need
