Reasoning with Autoregressive-Diffusion Collaborative Thoughts
Mu Yuan, Liekang Zeng, Guoliang Xing, Lan Zhang, Yunhao Liu

TL;DR
This paper introduces Collaborative Thoughts, a unified framework that combines autoregressive and diffusion models through a closed-loop interaction to enhance reasoning, spatial understanding, and controllability in generative tasks.
Contribution
The paper proposes a collaborative framework in which autoregressive and diffusion models reason and generate jointly through a closed-loop interaction, with the aim of improving spatial reasoning and controllability.
Findings
Enhanced spatial reasoning accuracy
Improved controllability in generation tasks
Effective error correction across modalities
Abstract
Autoregressive and diffusion models represent two complementary generative paradigms. Autoregressive models excel at sequential planning and constraint composition, yet struggle with tasks that require explicit spatial or physical grounding. Diffusion models, in contrast, capture rich spatial structure through high-dimensional generation, but lack the stepwise logical control needed to satisfy complex, multi-stage constraints or to reliably identify and correct errors. We introduce Collaborative Thoughts, a unified collaborative framework that enables autoregressive and diffusion models to reason and generate jointly through a closed-loop interaction. In Collaborative Thoughts, autoregressive models perform structured planning and constraint management, diffusion models instantiate these constraints as intermediate visual thoughts, and a vision-based critic module evaluates whether the…
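The closed loop described in the abstract can be pictured with a small Python sketch. This is an illustration under stated assumptions, not the authors' implementation: the names `collaborative_loop`, `Feedback`, `toy_plan`, `toy_render`, and `toy_critique` are hypothetical, the three roles are replaced by toy stand-ins rather than real models, and the step that feeds the critic's verdict back to the planner is an assumption, since the abstract is truncated before it says what happens after evaluation.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Optional, Tuple


@dataclass
class Feedback:
    """Critic verdict: whether all constraints hold, and which ones do not."""
    satisfied: bool
    violated: List[str] = field(default_factory=list)


def collaborative_loop(
    constraints: List[str],
    plan: Callable[[List[str], Optional[Feedback]], List[str]],
    render: Callable[[List[str]], Any],
    critique: Callable[[Any, List[str]], Feedback],
    max_rounds: int = 3,
) -> Tuple[Any, List[Tuple[List[str], bool]]]:
    """Plan -> render -> critique, repeating until the critic is satisfied.

    Assumption: violated constraints are handed back to the planner on the
    next round. The source abstract does not spell out this step.
    """
    feedback: Optional[Feedback] = None
    thought: Any = None
    history: List[Tuple[List[str], bool]] = []
    for _ in range(max_rounds):
        prompt = plan(constraints, feedback)        # autoregressive role (stub)
        thought = render(prompt)                    # diffusion role (stub)
        feedback = critique(thought, constraints)   # vision critic role (stub)
        history.append((prompt, feedback.satisfied))
        if feedback.satisfied:
            break
    return thought, history


# Toy stand-ins so the loop runs without any model.
def toy_plan(constraints: List[str], feedback: Optional[Feedback]) -> List[str]:
    # First round: deliberately cover only the first constraint.
    if feedback is None:
        return constraints[:1]
    # Later rounds: add whatever the critic reported as violated.
    return constraints[:1] + feedback.violated


def toy_render(prompt: List[str]) -> set:
    # The "visual thought" is just the set of constraints that were depicted.
    return set(prompt)


def toy_critique(thought: set, constraints: List[str]) -> Feedback:
    missing = [c for c in constraints if c not in thought]
    return Feedback(satisfied=not missing, violated=missing)


if __name__ == "__main__":
    goals = ["red cube left of sphere", "sphere on table"]
    final_thought, rounds = collaborative_loop(goals, toy_plan, toy_render, toy_critique)
    for i, (prompt, ok) in enumerate(rounds, start=1):
        print(f"round {i}: prompt={prompt} satisfied={ok}")
```

In this toy run the first round fails the critic because only one constraint was rendered, and the second round succeeds once the violated constraint is added to the plan. In a real system each stub would wrap a model call, and the `max_rounds` cap is a design choice to bound cost when the critic never reports success.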
Taxonomy
Topics: Multimodal Machine Learning Applications · AI-based Problem Solving and Planning · Constraint Satisfaction and Optimization
