Understanding Transformer Reasoning Capabilities via Graph Algorithms
Clayton Sanford, Bahare Fatemi, Ethan Hall, Anton Tsitsulin, Mehran Kazemi, Jonathan Halcrow, Bryan Perozzi, Vahab Mirrokni

TL;DR
This paper explores the algorithmic reasoning capabilities of transformers, identifying the scaling regimes needed for different problems and demonstrating their effectiveness on graph reasoning tasks through theory and experiments.
Contribution
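The paper introduces a representational hierarchy that classifies nine algorithmic reasoning problems by whether transformers can solve them under different parameter scaling regimes (depth, width, and number of extra tokens). It supports the classification with both theoretical proofs and empirical validation.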
Findings
Logarithmic depth is necessary and sufficient for graph connectivity.
Single-layer transformers with small embedding dimensions can solve contextual retrieval tasks.
Transformers outperform some specialized graph neural networks on graph reasoning.
Abstract
Which transformer scaling regimes are able to perfectly solve different classes of algorithmic problems? While tremendous empirical advances have been attained by transformer-based neural networks, a theoretical understanding of their algorithmic reasoning capabilities in realistic parameter regimes is lacking. We investigate this question in terms of the network's depth, width, and number of extra tokens for algorithm execution. Our novel representational hierarchy separates 9 algorithmic reasoning problems into classes solvable by transformers in different realistic parameter scaling regimes. We prove that logarithmic depth is necessary and sufficient for tasks like graph connectivity, while single-layer transformers with small embedding dimensions can solve contextual retrieval tasks. We also support our theoretical analysis with ample empirical evidence using the GraphQA benchmark.…
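One way to build intuition for why a logarithmic number of steps can suffice for graph connectivity is the classical repeated-squaring (path-doubling) argument: if each round combines pairs of already-known paths, the path length covered doubles every round. The sketch below illustrates only that idea on an undirected graph. It is not the construction used in the paper, and the function names (`reachability_by_doubling`, `connected`) are hypothetical.

```python
import math


def reachability_by_doubling(n, edges):
    """Return (reach, rounds) for an undirected graph on nodes 0..n-1.

    reach[u][v] is True when v is reachable from u. Each round joins two
    known paths, so the covered path length doubles per round.
    """
    # Start with paths of length at most 1 (self-loops plus direct edges).
    reach = [[u == v for v in range(n)] for u in range(n)]
    for u, v in edges:
        reach[u][v] = True
        reach[v][u] = True  # assumption: the graph is undirected

    # After k rounds, paths of length up to 2**k are covered. A simple path
    # has at most n - 1 edges, so ceil(log2(n)) rounds are enough.
    rounds = math.ceil(math.log2(n)) if n > 1 else 0
    for _ in range(rounds):
        reach = [
            [any(reach[u][w] and reach[w][v] for w in range(n)) for v in range(n)]
            for u in range(n)
        ]
    return reach, rounds


def connected(n, edges, s, t):
    """Answer a single connectivity query between nodes s and t."""
    reach, _ = reachability_by_doubling(n, edges)
    return reach[s][t]
```

Each round here loosely plays the role of one layer of parallel computation; the number of rounds grows with log2(n) rather than with n, which is the scaling behavior the depth result refers to.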
Taxonomy
Topics: AI-based Problem Solving and Planning · Model-Driven Software Engineering Techniques · Advanced Software Engineering Methodologies
