On the Ability of Transformers to Verify Plans

Yash Sarrof; Yupei Du; Katharina Stein; Alexander Koller; Sylvie Thiébaux; Michael Hahn

arXiv:2603.19954·cs.AI·March 23, 2026

Open Access

TL;DR

This paper investigates the capacity of transformer models to verify plans in classical AI planning, providing theoretical guarantees for length generalization and validating findings through empirical experiments.

Contribution

The paper introduces C*-RASP, an extension of C-RASP, to analyze how transformers generalize in plan verification when sequence length and vocabulary size grow together.

Findings

01

Transformers can provably verify long plans in certain classical planning domains.

02

Structural properties of planning problems influence the learnability of length-generalizable solutions.

03

Empirical results support the theoretical guarantees for length generalization.

Abstract

Transformers have shown inconsistent success in AI planning tasks, and theoretical understanding of when generalization should be expected has been limited. We take important steps towards addressing this gap by analyzing the ability of decoder-only models to verify whether a given plan correctly solves a given planning instance. To analyze the general setting where the number of objects -- and thus the effective input alphabet -- grows at test time, we introduce C*-RASP, an extension of C-RASP designed to establish length generalization guarantees for transformers under simultaneous growth in sequence length and vocabulary size. Our results identify a large class of classical planning domains for which transformers can provably learn to verify long plans, and structural properties that significantly affect the learnability of length-generalizable solutions. Empirical experiments…
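
To make the verification task in the abstract concrete, the following is a minimal sketch of a symbolic plan verifier under standard STRIPS-style semantics. It is an illustrative assumption, not the paper's input encoding and not C*-RASP: the `Action` type, the `verify_plan` function, and the two-block example are all hypothetical names chosen here. A plan is accepted when every action's preconditions hold in the state where it is applied and the goal holds in the final state.

```python
from typing import FrozenSet, Iterable, NamedTuple, Sequence


class Action(NamedTuple):
    """A ground STRIPS-style action (hypothetical helper, not from the paper)."""
    name: str
    preconditions: FrozenSet[str]
    add_effects: FrozenSet[str]
    delete_effects: FrozenSet[str]


def verify_plan(initial_state: Iterable[str],
                goal: Iterable[str],
                plan: Sequence[Action]) -> bool:
    """Return True iff the plan is executable from initial_state and reaches goal."""
    state = set(initial_state)
    for action in plan:
        # Each action must be applicable in the current state.
        if not action.preconditions <= state:
            return False
        # Apply effects: remove deleted facts, then add new facts.
        state = (state - action.delete_effects) | action.add_effects
    # The goal facts must all hold after the last action.
    return set(goal) <= state


# Toy two-block instance (assumed for illustration only).
pick_up_a = Action(
    name="pick-up(a)",
    preconditions=frozenset({"clear(a)", "ontable(a)", "handempty"}),
    add_effects=frozenset({"holding(a)"}),
    delete_effects=frozenset({"clear(a)", "ontable(a)", "handempty"}),
)
stack_a_b = Action(
    name="stack(a,b)",
    preconditions=frozenset({"holding(a)", "clear(b)"}),
    add_effects=frozenset({"on(a,b)", "clear(a)", "handempty"}),
    delete_effects=frozenset({"holding(a)", "clear(b)"}),
)

initial = {"clear(a)", "ontable(a)", "clear(b)", "ontable(b)", "handempty"}
goal = {"on(a,b)"}

valid = verify_plan(initial, goal, [pick_up_a, stack_a_b])
invalid = verify_plan(initial, goal, [stack_a_b])  # holding(a) is missing
```

In this sketch, adding more blocks introduces new fact and action names, so the set of distinct symbols grows along with the plan length. This mirrors the setting the abstract describes, where the number of objects, and thus the effective input alphabet, grows at test time.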

Taxonomy

Topics: AI-based Problem Solving and Planning · Artificial Intelligence in Games · Reinforcement Learning in Robotics