When Diffusion Breaks Constraints: Sequential Autoregressive Generation with RL and MCTS
Zirui Zhao, Boye Niu, Harold Soh, David Hsu, Wee Sun Lee

TL;DR
This paper identifies the limitations of diffusion models in constrained generation tasks and proposes a sequential autoregressive approach with reinforcement learning and MCTS to improve feasibility.
Contribution
The paper reformulates constrained generation as discrete autoregressive sequential generation to address the failure modes of diffusion models.
Findings
Diffusion models struggle when feasible solutions lie on low-dimensional, small, and sometimes disconnected regions of the output space.
Reinforcement learning improves the feasibility and success rate of constrained generation.
Monte Carlo tree search (MCTS) is used to evaluate the value of look-ahead as the feasible region shrinks during sequential generation.
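To make the sequential view concrete, here is a minimal sketch in the spirit of the rectangle composition task. It is not the authors' implementation: the integer grid, the helper names (`feasible_actions`, `rollout`, `generate`), and the uniform random policy standing in for a learned RL policy are all assumptions for illustration. Look-ahead is approximated with flat Monte Carlo rollouts rather than a full MCTS, which is enough to show how early placements shrink the feasible set for later ones.

```python
import random
from typing import List, Optional, Sequence, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height) on an integer grid


def overlaps(a: Rect, b: Rect) -> bool:
    """True if the interiors of two axis-aligned rectangles intersect."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


def feasible_actions(placed: Sequence[Rect], size: Tuple[int, int],
                     box: Tuple[int, int]) -> List[Rect]:
    """Enumerate in-box placements of `size` that do not overlap `placed`."""
    w, h = size
    box_w, box_h = box
    candidates = [(x, y, w, h)
                  for x in range(box_w - w + 1)
                  for y in range(box_h - h + 1)]
    return [c for c in candidates if not any(overlaps(c, p) for p in placed)]


def rollout(placed: Sequence[Rect], remaining: Sequence[Tuple[int, int]],
            box: Tuple[int, int], rng: random.Random) -> bool:
    """Finish the layout with uniformly random feasible moves (assumed policy)."""
    placed = list(placed)
    for size in remaining:
        actions = feasible_actions(placed, size, box)
        if not actions:
            return False  # the feasible region collapsed to empty
        placed.append(rng.choice(actions))
    return True


def lookahead_score(action: Rect, placed: Sequence[Rect],
                    remaining: Sequence[Tuple[int, int]],
                    box: Tuple[int, int], rng: random.Random,
                    n_rollouts: int) -> int:
    """Count how many random rollouts complete after taking `action`."""
    return sum(rollout(list(placed) + [action], remaining, box, rng)
               for _ in range(n_rollouts))


def generate(sizes: Sequence[Tuple[int, int]], box: Tuple[int, int],
             n_rollouts: int = 20, seed: int = 0) -> Optional[List[Rect]]:
    """Place rectangles one at a time, choosing each by rollout look-ahead."""
    rng = random.Random(seed)
    placed: List[Rect] = []
    for i, size in enumerate(sizes):
        actions = feasible_actions(placed, size, box)
        if not actions:
            return None
        best = max(actions, key=lambda a: lookahead_score(
            a, placed, sizes[i + 1:], box, rng, n_rollouts))
        placed.append(best)
    return placed


layout = generate([(2, 2)] * 4, box=(4, 4))
print(layout)
```

In this toy instance, four 2x2 squares must tile a 4x4 box exactly, so a first placement that is not in a corner leaves no feasible completion. Masking infeasible actions at each step guarantees that every partial layout satisfies the constraints, and the rollout score steers early choices away from dead ends.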
Abstract
Data-driven generative models excel in language and vision, but diffusion models often fail in constrained planning and design tasks, exhibiting severe constraint violations in engineering inverse design, molecular generation, multi-robot planning, and floorplan/scene synthesis even with projection or guidance. Such tasks combine hard-to-specify semantic goals with strict geometric or physical constraints (e.g., non-overlap, connectivity), yielding feasible solutions that lie on low-dimensional, small, and sometimes disconnected regions of the output space. This paper studies the failure mode through tangram generation from language, where seven fixed shapes must form a text-described silhouette while remaining connected and non-overlapping, and a simplified rectangle composition task with a learned bounding-box constraint. We find diffusion models struggle to satisfy constraints,…
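The two geometric constraints named in the abstract, non-overlap and connectivity, can be illustrated with a small feasibility checker. This is a hedged sketch only: it uses axis-aligned rectangles with integer coordinates as a stand-in for tangram polygons, it assumes "connected" means that pieces share a boundary segment of positive length, and the function names are hypothetical.

```python
from collections import deque
from typing import List, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height), integer coordinates


def _extent(a: Rect, b: Rect) -> Tuple[int, int]:
    """Signed overlap length of two rectangles along the x and y axes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    dx = min(ax + aw, bx + bw) - max(ax, bx)
    dy = min(ay + ah, by + bh) - max(ay, by)
    return dx, dy


def overlaps(a: Rect, b: Rect) -> bool:
    """Interiors intersect: positive overlap along both axes."""
    dx, dy = _extent(a, b)
    return dx > 0 and dy > 0


def touches(a: Rect, b: Rect) -> bool:
    """Share an edge segment of positive length (corner contact excluded)."""
    dx, dy = _extent(a, b)
    return (dx == 0 and dy > 0) or (dy == 0 and dx > 0)


def is_connected(rects: List[Rect]) -> bool:
    """Breadth-first search over the 'touches' graph from the first piece."""
    if not rects:
        return True
    seen = {0}
    queue = deque([0])
    while queue:
        i = queue.popleft()
        for j in range(len(rects)):
            if j not in seen and touches(rects[i], rects[j]):
                seen.add(j)
                queue.append(j)
    return len(seen) == len(rects)


def is_feasible(rects: List[Rect]) -> bool:
    """Feasible layouts have no pairwise overlap and form one connected piece."""
    no_overlap = all(not overlaps(rects[i], rects[j])
                     for i in range(len(rects))
                     for j in range(i + 1, len(rects)))
    return no_overlap and is_connected(rects)


print(is_feasible([(0, 0, 2, 2), (2, 0, 2, 2)]))  # side by side
print(is_feasible([(0, 0, 2, 2), (1, 1, 2, 2)]))  # overlapping
print(is_feasible([(0, 0, 2, 2), (3, 0, 2, 2)]))  # separated by a gap
```

Shifting a piece by a single unit flips the result from feasible to infeasible in either direction (overlap or gap), which gives a rough intuition for why the feasible set occupies a thin region of the output space.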
