Shattered Compositionality: Counterintuitive Learning Dynamics of Transformers for Arithmetic

Xingyu Zhao; Darsh Sharma; Rheeya Uppaal; Yiqiao Zhong

arXiv:2601.22510 · cs.LG · February 2, 2026

TL;DR

This paper investigates how transformers learn arithmetic skills. It finds that they often acquire these skills in non-human-like ways, which leads to errors, especially under distribution shifts. The authors attribute this to reliance on correlational rather than causal learning mechanisms.

Contribution

The paper identifies the phenomenon of shattered compositionality in transformers, shows that their learning dynamics differ from human-like sequential skill acquisition, and provides evidence about the underlying causes.

Findings

01. Transformers often learn skills in reverse order or in parallel.

02. Shattered compositionality persists across modern LLMs and is unaffected by scaling.

03. Correlational matching to training data influences learning dynamics.

Abstract

Large language models (LLMs) often exhibit unexpected errors or unintended behavior, even at scale. While recent work reveals the discrepancy between LLMs and humans in skill compositions, the learning dynamics of skill compositions and the underlying cause of non-human behavior remain elusive. In this study, we investigate the mechanism of learning dynamics by training transformers on synthetic arithmetic tasks. Through extensive ablations and fine-grained diagnostic metrics, we discover that transformers do not reliably build skill compositions according to human-like sequential rules. Instead, they often acquire skills in reverse order or in parallel, which leads to unexpected mixing errors especially under distribution shifts--a phenomenon we refer to as shattered compositionality. To explain these behaviors, we provide evidence that correlational matching to the training data,…
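
To make the idea of a fine-grained diagnostic on a synthetic arithmetic task concrete, here is a minimal sketch. It is not the authors' code: the task format (`"a+b="`), the helper names `make_addition_example` and `per_position_accuracy`, and the choice of per-digit accuracy as the metric are all assumptions made for illustration. Tracking accuracy separately for each digit position over training is one plausible way to see whether digit-level skills appear in sequence, in reverse, or in parallel.

```python
import random

def make_addition_example(rng, n_digits=3):
    """Return (prompt, answer) for a random n-digit addition problem (hypothetical format)."""
    a = rng.randrange(10 ** (n_digits - 1), 10 ** n_digits)
    b = rng.randrange(10 ** (n_digits - 1), 10 ** n_digits)
    return f"{a}+{b}=", str(a + b)

def per_position_accuracy(predictions, targets):
    """Accuracy at each digit position, indexed from the least significant digit.

    A predicted string that is too short counts as wrong at the missing positions.
    """
    width = max(len(t) for t in targets)
    correct = [0] * width
    total = [0] * width
    for pred, tgt in zip(predictions, targets):
        # Reverse both strings so that index 0 is the units digit.
        p_rev, t_rev = pred[::-1], tgt[::-1]
        for i, t_char in enumerate(t_rev):
            total[i] += 1
            if i < len(p_rev) and p_rev[i] == t_char:
                correct[i] += 1
    return [c / n for c, n in zip(correct, total)]
```

In use, one would decode model outputs for a held-out batch at each checkpoint and plot the returned list over training steps; the model and decoding loop are omitted here because the excerpt does not specify them.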

Taxonomy

Topics: Language and cultural evolution · Topic Modeling · Machine Learning in Materials Science