Barriers to Discrete Reasoning with Transformers: A Survey Across Depth, Exactness, and Bandwidth
Michelle Yuan, Weiyi Sun, Amir H. Rezaeian, Jyotika Singh, Sandip Ghoshal, Yao-Ting Wang, Miguel Ballesteros, Yassine Benajiba

TL;DR
This survey analyzes the theoretical barriers faced by transformer models in discrete reasoning tasks, highlighting structural limitations related to depth, approximation, and communication complexity, and discusses future directions for overcoming these challenges.
Contribution
The survey provides a unified theoretical framework connecting circuit complexity, approximation theory, and communication complexity to explain why transformers struggle with exact discrete algorithms.
Findings
Transformers face depth and communication bottlenecks in symbolic computation.
Approximation of discontinuities remains a key challenge.
Structural limitations hinder exact discrete reasoning in current architectures.
Abstract
Transformers have become the foundational architecture for a broad spectrum of sequence modeling applications, underpinning state-of-the-art systems in natural language processing, vision, and beyond. However, their theoretical limitations in discrete reasoning tasks, such as arithmetic, logical inference, and algorithmic composition, remain a critical open problem. In this survey, we synthesize recent studies from three theoretical perspectives: circuit complexity, approximation theory, and communication complexity, to clarify the structural and computational barriers that transformers face when performing symbolic computations. By connecting these established theoretical frameworks, we provide an accessible and unified account of why current transformer architectures struggle to implement exact discrete algorithms, even as they excel at pattern matching and interpolation. We review…
Taxonomy
Topics: Ferroelectric and Negative Capacitance Devices · Graph Theory and Algorithms · Multimodal Machine Learning Applications
