Algorithmic Phase Transitions in Language Models: A Mechanistic Case Study of Arithmetic
Alan Sun, Ethan Sun, Warren Shepard

TL;DR
This paper investigates how large language models switch problem-solving strategies in arithmetic tasks, revealing that instability in these strategies may hinder their zero-shot generalization capabilities.
Contribution
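The paper defines algorithmic stability in language models, meaning how the problem-solving strategy a model employs changes when the task specification changes, and presents evidence that instability may limit generalization across closely related tasks.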
Findings
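Gemma-2-2b employs substantially different computational models on closely related arithmetic subtasks, such as four-digit versus eight-digit addition.
Algorithmic instability may be a contributing factor to poor zero-shot performance on certain logical reasoning tasks.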
Models struggle to abstract and transition between different problem-solving strategies.
Abstract
Zero-shot capabilities of large language models make them powerful tools for solving a range of tasks without explicit training. It remains unclear, however, how these models achieve such performance, or why they can zero-shot some tasks but not others. In this paper, we shed some light on this phenomenon by defining and investigating algorithmic stability in language models -- changes in problem-solving strategy employed by the model as a result of changes in task specification. We focus on a task where algorithmic stability is needed for generalization: two-operand arithmetic. Surprisingly, we find that Gemma-2-2b employs substantially different computational models on closely related subtasks, i.e. four-digit versus eight-digit addition. Our findings suggest that algorithmic instability may be a contributing factor to language models' poor zero-shot performance across certain logical…
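To make the task setup concrete, the following is a minimal sketch of how one might generate paired four-digit and eight-digit addition prompts for a two-operand arithmetic probe. The prompt format (`a+b=`), the helper names `make_addition_prompt` and `build_subtasks`, and the sampling scheme are illustrative assumptions, not the authors' actual data pipeline.

```python
import random


def make_addition_prompt(n_digits: int, rng: random.Random):
    """Return (prompt, answer) for adding two operands with exactly n_digits digits.

    The "a+b=" prompt format is an assumption made for illustration only.
    """
    low, high = 10 ** (n_digits - 1), 10 ** n_digits - 1
    a = rng.randint(low, high)
    b = rng.randint(low, high)
    return f"{a}+{b}=", a + b


def build_subtasks(digit_counts=(4, 8), per_task=3, seed=0):
    """Build a small set of prompts for each closely related subtask."""
    rng = random.Random(seed)  # fixed seed so the prompt set is reproducible
    return {
        n: [make_addition_prompt(n, rng) for _ in range(per_task)]
        for n in digit_counts
    }


if __name__ == "__main__":
    for n_digits, examples in build_subtasks().items():
        for prompt, answer in examples:
            print(f"{n_digits}-digit subtask: {prompt} (expected {answer})")
```

Holding the prompt template fixed and varying only the operand length keeps the two subtasks as close as possible, so any difference in the model's internal computation can be attributed to the change in task specification.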
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling
