Discrete Diffusion Models Exploit Asymmetry to Solve Lookahead Planning Tasks
Itamar Trainin, Shauli Ravfogel, Omri Abend, Amir Feder

TL;DR
This paper compares autoregressive (AR) models with non-autoregressive (NAR) discrete diffusion models on lookahead planning tasks, showing that NAR models exploit the asymmetry between forward and reverse generation to solve these tasks with less data and simpler architectures.
Contribution
The paper uncovers the mechanistic differences that let NAR models leverage asymmetry for planning, and demonstrates that they are more efficient than AR models on lookahead tasks.
Findings
NAR models solve planning tasks by decoding backwards using future tokens.
NAR models require exponentially fewer training examples than AR models.
AR models often fail to converge without curriculum adjustments.
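The backward-decoding finding rests on an asymmetry in the task itself, which a toy path-finding example can illustrate. The sketch below is not the paper's setup or models: the graph `CHILDREN`, the node names, and both helper functions are hypothetical, chosen only to show that choosing the first step from the source needs search over branches, while walking back from the goal is deterministic because every node has a single parent.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy graph (an assumption, not from the paper):
# a tree rooted at "s" with three branches of length two.
CHILDREN: Dict[str, List[str]] = {
    "s": ["a1", "b1", "c1"],
    "a1": ["a2"], "a2": [],
    "b1": ["b2"], "b2": [],
    "c1": ["c2"], "c2": [],
}


def forward_path(children: Dict[str, List[str]], source: str,
                 goal: str) -> Tuple[Optional[List[str]], int]:
    """Find source -> goal by depth-first search.

    Returns the path and the number of nodes visited. At a branching
    node the correct child is unknown without looking ahead, so whole
    branches may be explored before the right one is found.
    """
    visited = 0

    def dfs(node: str) -> Optional[List[str]]:
        nonlocal visited
        visited += 1
        if node == goal:
            return [node]
        for child in children[node]:
            tail = dfs(child)
            if tail is not None:
                return [node] + tail
        return None

    return dfs(source), visited


def backward_path(children: Dict[str, List[str]], source: str,
                  goal: str) -> List[str]:
    """Recover source -> goal by following parent pointers from the goal.

    In a tree each node has exactly one parent, so every backward step
    is forced and no search is needed.
    """
    parent = {c: p for p, cs in children.items() for c in cs}
    path = [goal]
    while path[-1] != source:
        path.append(parent[path[-1]])
    return path[::-1]


path_fwd, nodes_visited = forward_path(CHILDREN, "s", "c2")
path_bwd = backward_path(CHILDREN, "s", "c2")
print(path_fwd, nodes_visited)  # same path as path_bwd, found after visiting all 7 nodes
print(path_bwd)
```

Both functions return the same path, but the forward search visits every node in this graph before reaching the goal in the last branch, while the backward walk takes exactly one step per edge on the path.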
Abstract
While Autoregressive (AR) Transformer-based Generative Language Models are frequently employed for lookahead tasks, recent research suggests a potential discrepancy in their ability to perform planning tasks that require multi-step lookahead. In this work, we investigate the distinct emergent mechanisms that arise when training AR versus Non-Autoregressive (NAR) models, such as Discrete Diffusion Models (dLLMs), on lookahead tasks. By requiring the models to plan ahead to reach the correct conclusion, we analyze how these two paradigms fundamentally differ in their approach to the problem. We identify a critical asymmetry in planning problems: while forward generation requires complex lookahead at branching junctions, reverse generation is often deterministic. This asymmetry creates an opportunity for NAR models. Through mechanistic analysis of training and inference dynamics, we…
Taxonomy
Topics: AI-based Problem Solving and Planning · Artificial Intelligence in Games · Reinforcement Learning in Robotics
