Stop-Think-AutoRegress: Language Modeling with Latent Diffusion Planning

Justin Lovelace; Christian Belardi; Sofian Zalouk; Adhitya Polavaram; Srivatsa Kundurthy; Kilian Q. Weinberger

arXiv:2602.20528 · cs.CL · February 25, 2026

TL;DR

STAR-LDM is a language model that combines latent diffusion planning with autoregressive generation, enabling global semantic planning, stronger results on language understanding benchmarks, and improved controllability.

Contribution

The paper presents a language modeling approach that integrates diffusion-based planning with autoregressive generation to improve global coherence and controllability.

Findings

1. Outperforms similar-sized models on language understanding benchmarks.

2. Achieves win rates above 70% in LLM-as-judge comparisons for narrative coherence and commonsense reasoning.

3. Enables attribute control through lightweight classifiers without retraining the model, with better fluency-control trade-offs than specialized approaches.

Abstract

The Stop-Think-AutoRegress Language Diffusion Model (STAR-LDM) integrates latent diffusion planning with autoregressive generation. Unlike conventional autoregressive language models limited to token-by-token decisions, STAR-LDM incorporates a "thinking" phase that pauses generation to refine a semantic plan through diffusion before continuing. This enables global planning in continuous space prior to committing to discrete tokens. Evaluations show STAR-LDM significantly outperforms similar-sized models on language understanding benchmarks and achieves $>70\%$ win rates in LLM-as-judge comparisons for narrative coherence and commonsense reasoning. The architecture also allows straightforward control through lightweight classifiers, enabling fine-grained steering of attributes without model retraining while maintaining better fluency-control trade-offs than specialized approaches.
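
The stop, think, then autoregress flow described in the abstract can be illustrated with a deliberately tiny sketch. This is not the paper's implementation: every name here (`predict_clean_plan`, `classifier_grad`, `think`, `autoregress`), the two-dimensional latent, the linear classifier, and the greedy dot-product decoder are hypothetical stand-ins chosen only to show the control flow. It assumes the plan is a small vector refined from noise over a fixed number of steps, and that steering amounts to nudging that vector along the gradient of a lightweight classifier's log-probability.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def predict_clean_plan(z, prompt_vec):
    # Hypothetical stand-in for a learned denoiser. A real model would
    # predict the clean latent from the noisy latent, the step, and the prompt.
    return list(prompt_vec)


def classifier_grad(z, w):
    # Gradient of log sigmoid(w . z) with respect to z for a linear classifier.
    s = sigmoid(sum(wi * zi for wi, zi in zip(w, z)))
    return [(1.0 - s) * wi for wi in w]


def think(prompt_vec, num_steps=10, w=None, guidance_scale=0.0, seed=0):
    """'Think' phase: refine a continuous plan from noise before decoding."""
    rng = random.Random(seed)
    z = [rng.gauss(0.0, 1.0) for _ in prompt_vec]  # start from pure noise
    for t in range(num_steps):
        x0 = predict_clean_plan(z, prompt_vec)
        alpha = 1.0 / (num_steps - t)  # reaches 1.0 on the final step
        z = [zi + alpha * (xi - zi) for zi, xi in zip(z, x0)]
        if w is not None:
            # Assumed form of steering: move the plan toward the attribute
            # the classifier scores highly, leaving the model untouched.
            g = classifier_grad(z, w)
            z = [zi + guidance_scale * gi for zi, gi in zip(z, g)]
    return z


def autoregress(prompt_tokens, plan, vocab, max_new_tokens=3):
    """Toy decoder: emit one token at a time, conditioned on prefix and plan."""
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        candidates = [tok for tok in vocab if tok not in tokens]
        if not candidates:
            break
        best = max(
            candidates,
            key=lambda tok: sum(a * b for a, b in zip(vocab[tok], plan)),
        )
        tokens.append(best)
    return tokens


# Toy vocabulary with made-up 2-d embeddings.
vocab = {"sun": [1.0, 0.0], "rain": [-1.0, 0.0], "day": [0.5, 0.5]}
plan = think([1.0, 0.2])                                       # stop and think
steered = think([1.0, 0.2], w=[1.0, 0.0], guidance_scale=0.5)  # with control
continuation = autoregress(["the"], plan, vocab, max_new_tokens=2)
```

In this toy the classifier only ever touches the latent plan, which mirrors the abstract's claim that attributes can be steered without retraining the model; how the actual system conditions its decoder on the plan is not specified in this listing.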

Taxonomy

Topics: Topic Modeling · Machine Learning in Healthcare · Multimodal Machine Learning Applications