Algorithmic Task Capture, Computational Complexity, and Inductive Bias of Infinite Transformers

Orit Davidovich; Zohar Ringel

arXiv:2603.11161 · cs.LG · May 8, 2026

TL;DR

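This paper defines algorithmic capture for transformers: the ability to extrapolate a combinatorial task to arbitrary task sizes with controllable error and logarithmic sample adaptation. It then bounds the inference-time computational complexity of the tasks that infinite-width transformers can capture and characterizes their inductive bias against higher-complexity algorithms.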

Contribution

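The paper introduces a formal definition of algorithmic capture, derives upper bounds on the inference-time computational complexity of the tasks that infinite-width transformers can capture in both the lazy and rich regimes, and identifies an inductive bias that disfavors higher-complexity algorithmic procedures.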

Findings

01

Transformers show evidence of capturing some combinatorial tasks, extrapolating them to larger task sizes.

02

Despite their universal expressivity, transformers have an inductive bias that disfavors higher-complexity algorithmic procedures within the efficient polynomial-time heuristic scheme class.

03

Across scaling ranges spanning up to 2.5 orders of magnitude, experiments show evidence of both capture and non-capture.

Abstract

We formally define algorithmic capture of combinatorial tasks as the ability of a transformer to extrapolate to arbitrary task sizes with controllable error and logarithmic sample adaptation, providing a sharp scaling criterion for distinguishing logic internalization from statistical interpolation. Empirically, across scaling ranges spanning up to 2.5 orders of magnitude, we observe evidence of capture and non-capture. By analyzing infinite-width transformers in both the lazy and rich regimes, we derive upper bounds on the inference-time computational complexity of the combinatorial tasks these networks can capture. We show that, despite their universal expressivity, transformers possess an inductive bias that disfavors higher-complexity algorithmic procedures within the efficient polynomial-time heuristic scheme class, consistent with successful capture on simpler combinatorial tasks…
