Lookahead-then-Verify: Reliable Constrained Decoding for Diffusion LLMs under Context-Free Grammars

Yitong Zhang; Yongmin Li; Yuetong Liu; Jia Li; Xiaoran Jia; Zherui Li; Ge Li

arXiv:2602.00612·cs.CL·February 10, 2026

Open Access

TL;DR

This paper introduces LAVE, a constrained decoding method for diffusion large language models (dLLMs) that uses their parallel token prediction to enforce syntactic correctness under context-free grammars, improving the validity of generated outputs.

Contribution

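LAVE is a constrained decoding approach tailored to dLLMs that reliably enforces grammatical correctness without significant runtime cost, in contrast to earlier dLLM-specific approaches that may admit intermediate outputs with no valid completion.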

Findings

01 LAVE outperforms existing methods in syntactic correctness across multiple benchmarks.

02 LAVE maintains high efficiency with negligible runtime overhead.

03 LAVE demonstrates consistent improvements across four different dLLMs.

Abstract

Diffusion Large Language Models (dLLMs) have demonstrated promising generative capabilities and are increasingly used to produce formal languages defined by context-free grammars, such as source code and chemical expressions. However, as probabilistic models, they still struggle to generate syntactically valid outputs reliably. A natural and promising direction to address this issue is to adapt constrained decoding techniques to enforce grammatical correctness during generation. However, applying these techniques faces two primary obstacles. On the one hand, the non-autoregressive nature of dLLMs renders most existing constrained decoding approaches inapplicable. On the other hand, current approaches specifically designed for dLLMs may allow intermediate outputs that are impossible to complete into valid sentences, which significantly limits their reliability in practice. To address…
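The reliability problem described in the abstract, intermediate outputs that cannot be completed into a valid sentence, can be illustrated with a small sketch. This is not the LAVE algorithm, whose details are not given in the text above. It assumes a toy grammar (balanced parentheses), a fixed-length output in which `None` marks a still-masked position, and a hypothetical helper name `is_completable`.

```python
from typing import Optional, Sequence

MASK = None  # assumption: a masked (not yet decoded) position is represented as None


def is_completable(tokens: Sequence[Optional[str]]) -> bool:
    """Return True if the masked positions can be filled with "(" or ")"
    so that the whole fixed-length sequence is balanced.

    Toy stand-in for a CFG completability check; hypothetical helper,
    not the method from the paper.
    """
    depths = {0}  # set of nesting depths reachable after the tokens seen so far
    for tok in tokens:
        nxt = set()
        for d in depths:
            if tok in ("(", MASK):
                nxt.add(d + 1)          # open a parenthesis
            if tok in (")", MASK) and d > 0:
                nxt.add(d - 1)          # close one, only if something is open
        if not nxt:
            return False                # no valid prefix exists any more
        depths = nxt
    return 0 in depths                  # balanced only if depth can end at 0


# A dLLM may fill positions out of order; some partial outputs stay completable...
print(is_completable(["(", MASK, MASK, ")"]))
# ...while others are already dead ends even though masks remain.
print(is_completable([MASK, MASK, ")", ")", MASK, "("]))
```

In this sketch the second sequence cannot be repaired by any choice for the remaining masks, because the final committed token is an opening parenthesis. A decoder that accepts such a state has no valid continuation left, which is the failure mode the abstract attributes to some existing dLLM-specific approaches.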

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis