Constrained Decoding of Diffusion LLMs with Context-Free Grammars

Niels Mündler; Jasper Dekoninck; Martin Vechev

arXiv:2508.10111·cs.LG·August 18, 2025

TL;DR

This paper introduces the first constrained decoding method for diffusion-based large language models that ensures outputs adhere to formal languages like C++ and JSON, improving syntactic correctness in practical applications.

Contribution

The paper presents a constrained decoding approach for diffusion LLMs based on context-free grammars, closing a gap left by existing constrained decoding methods, which do not apply to diffusion LLMs.

Findings

01

Achieves near-perfect syntactic correctness in code and data generation.

02

Maintains or improves functional correctness with constrained decoding.

03

Ensures practical computational efficiency.

Abstract

Large language models (LLMs) have shown promising performance across diverse domains. Many practical applications of LLMs, such as code completion and structured data extraction, require adherence to syntactic constraints specified by a formal language. Yet, due to their probabilistic nature, LLM output is not guaranteed to adhere to such formal languages. Prior work has proposed constrained decoding as a means to restrict LLM generation to particular formal languages. However, existing works are not applicable to the emerging paradigm of diffusion LLMs, when used in practical scenarios such as the generation of formally correct C++ or JSON output. In this paper we address this challenge and present the first constrained decoding method for diffusion models, one that can handle formal languages captured by context-free grammars. We begin by reducing constrained decoding to the more…
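To make the idea of constrained decoding with out-of-order filling concrete, here is a minimal toy sketch. It is not the paper's algorithm, which handles full context-free grammars. It assumes a single hard-coded language (balanced parentheses), holes that each take exactly one character, and the hypothetical helper names `completable` and `allowed_fills`. A candidate character is accepted at a hole only if some valid completion of the remaining holes still exists.

```python
def completable(template: str) -> bool:
    """Return True if every '?' hole can be filled with '(' or ')'
    so that the whole string is balanced.

    Toy assumption: each hole holds exactly one character.
    """
    depths = {0}  # all nesting depths reachable after the characters read so far
    for ch in template:
        steps = {"(": (1,), ")": (-1,), "?": (1, -1)}[ch]
        # Drop any path whose depth would go negative (a ')' with no open '(').
        depths = {d + s for d in depths for s in steps if d + s >= 0}
        if not depths:
            return False
    return 0 in depths


def allowed_fills(template: str, pos: int) -> list[str]:
    """Characters that can go into the hole at `pos` without ruling out
    every valid completion of the other holes."""
    assert template[pos] == "?"
    return [
        c for c in "()"
        if completable(template[:pos] + c + template[pos + 1:])
    ]


if __name__ == "__main__":
    template = "(?)?"
    # Holes may be queried in any order, as a diffusion model might fill them.
    print(allowed_fills(template, 3))
    print(allowed_fills(template, 1))
```

In an actual decoder, a check of this kind would be used to filter the model's proposed tokens at each unmasking step; the set-of-depths bookkeeping here only works for this toy language and stands in for a general grammar-based check.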

Peer Reviews

Decision·ICLR 2026 Poster

Reviewer 01·Rating 4·Confidence 4

Strengths

The research question that this paper studied is timely and interesting.

Weaknesses

Motivation and benefit: While the paper successfully enforces syntactic validity in diffusion language models, it remains unclear whether this constraint leads to better semantic or functional outputs. Improving syntax alone doesn’t necessarily improve model accuracy or usefulness, so it would help to clarify when grammatical correctness translates to real task gains and when it simply bounds decoding behavior.

Presentation and flow: The presentation of the core algorithm (Sec. 3) feels somewha

Reviewer 02·Rating 6·Confidence 3

Strengths

- It proposes the first constrained decoding method applicable to Diffusion Language Models (DLMs), filling the gap in existing technologies that fail to constrain DLMs using Context-Free Grammars (CFGs). Meanwhile, it naturally supports the previously unsolved Multi-Region Infilling (MRI) scenario, breaking through the limitation that traditional constrained decoding can only be applied to left-to-right Prefix generation (PRE) or simple Fill-In-the-Middle (FIM).
- It achieves the first implemen

Weaknesses

- In practice, models are limited by the number of tokens, which may lead to failure in meeting syntactic constraints (such as unclosed parentheses and incomplete molecular structures) and leave some residual syntactic errors. Currently, there is a lack of efficient solutions for accurately modeling the number of remaining tokens.
- In the lexing phase, if there are a large number of ambiguous terminal sequences, even though optimization via a "unified NFA" is applied, the risk of combinatorial

Reviewer 03·Rating 8·Confidence 5

Strengths

* The paper is the first work to ensure CFG-constrained generation with diffusion LLMs.
* The paper is well-written and easy to follow. The formalism is solid, and the problem is presented in great detail.
* The paper addresses a challenging technical problem. Additionally, there are several non-trivial technical contributions, such as heuristics to reduce the size of the normalized CFG.
* The empirical results are consistently strong, showing syntactic and functional improvement.

Weaknesses

* The MRI task is not natural. Removing arbitrary character spans is not a realistic scenario in which one would expect to use an LLM. A more realistic setup would remove semantically meaningful parts of the code.
* The overhead of constraining can be large in some cases.
