Algorithmic Task Capture, Computational Complexity, and Inductive Bias of Infinite Transformers
Orit Davidovich, Zohar Ringel

TL;DR
This paper formally defines algorithmic capture in transformers, examines their ability to extrapolate on combinatorial tasks to arbitrary task sizes, and characterizes their inductive bias and the limits on the inference-time computational complexity of the tasks they can capture.
Contribution
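It introduces a formal definition of algorithmic capture, derives upper bounds on the inference-time computational complexity of the tasks that infinite-width transformers can capture (in both the lazy and rich regimes), and identifies an inductive bias against higher-complexity algorithms.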
Findings
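Transformers can extrapolate on certain simpler combinatorial tasks across scaling ranges of up to 2.5 orders of magnitude.
Despite universal expressivity, transformers have an inductive bias that disfavors higher-complexity algorithms within the efficient polynomial-time heuristic scheme class.
Empirical evidence shows both capture and non-capture across the scaling ranges tested.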
Abstract
We formally define algorithmic capture of combinatorial tasks as the ability of a transformer to extrapolate to arbitrary task sizes with controllable error and logarithmic sample adaptation, providing a sharp scaling criterion for distinguishing logic internalization from statistical interpolation. Empirically, across scaling ranges spanning up to 2.5 orders of magnitude, we observe evidence of capture and non-capture. By analyzing infinite-width transformers in both the lazy and rich regimes, we derive upper bounds on the inference-time computational complexity of the combinatorial tasks these networks can capture. We show that, despite their universal expressivity, transformers possess an inductive bias that disfavors higher-complexity algorithmic procedures within the efficient polynomial-time heuristic scheme class, consistent with successful capture on simpler combinatorial tasks…
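As a rough illustration of the "logarithmic sample adaptation" criterion, the sketch below compares a logarithmic fit against a power-law fit of the number of adaptation samples needed at each task size. It assumes one particular reading of the criterion (that the samples required at task size n grow like log n); that reading, the function names, and the synthetic data are all hypothetical and are not taken from the paper.

```python
import numpy as np

def sse_log_model(n, m):
    """Fit m ~ a*log(n) + b and return the sum of squared errors."""
    a, b = np.polyfit(np.log(n), m, 1)
    return float(np.sum((m - (a * np.log(n) + b)) ** 2))

def sse_power_model(n, m):
    """Fit m ~ exp(c) * n**p via log-log regression; return SSE in original units."""
    p, c = np.polyfit(np.log(n), np.log(m), 1)
    return float(np.sum((m - np.exp(c) * n ** p) ** 2))

def looks_logarithmic(sizes, samples_needed):
    """True if a logarithmic growth law fits better than a power law.

    Hypothetical diagnostic: `samples_needed[i]` is the (positive) number of
    adaptation samples required to reach a target error at task size `sizes[i]`.
    """
    n = np.asarray(sizes, dtype=float)
    m = np.asarray(samples_needed, dtype=float)
    return sse_log_model(n, m) < sse_power_model(n, m)

# Synthetic, purely illustrative curves (not measurements from the paper).
sizes = 2.0 ** np.arange(4, 13)          # task sizes 16 ... 4096
log_like = 5.0 * np.log(sizes) + 3.0     # grows like log n
power_like = 2.0 * np.sqrt(sizes)        # grows like n^0.5

print(looks_logarithmic(sizes, log_like))
print(looks_logarithmic(sizes, power_like))
```

Comparing both models by residuals in the original units keeps the comparison on a common scale; a more careful analysis would also account for noise and use held-out task sizes.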
