Constrained Code Generation with Discrete Diffusion
Lize Shao, Michael Cardei, Zichen Xie, Ferdinando Fioretto, Wenxi Wang

TL;DR
This paper introduces Constrained Diffusion for Code (CDC), a novel framework that integrates constraint satisfaction into discrete diffusion models to improve code generation quality and feasibility.
Contribution
CDC is a training-free neurosymbolic inference method that enhances discrete diffusion models with constraint-aware denoising operators for better code generation.
Findings
CDC improves constraint satisfaction on code generation benchmarks.
CDC outperforms baseline models on functional correctness, security, and syntactic validity.
CDC achieves these improvements with less corrective computation and localized edits.
Abstract
Discrete diffusion models are a powerful, emerging paradigm for code generation. They construct programs through iterative refinement of partially corrupted token sequences and enable parallel token refinement. Importantly, this paradigm exposes a global program state at each denoising step, which provides a natural intervention point for enforcing program-level functionality and security constraints, guiding the generation before the final code is committed. Building on this observation, the paper introduces Constrained Diffusion for Code (CDC), a training-free neurosymbolic inference framework that integrates constraint satisfaction directly into the reverse denoising process. CDC augments the base discrete diffusion sampler with constraint-aware denoising operators that combine mathematical optimization with program analysis to identify constraint-relevant regions of the intermediate…
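The idea of intervening on the global program state at each denoising step can be illustrated with a minimal toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: `toy_denoiser` is a random stand-in for a discrete diffusion model, `violating_positions` is a trivial check that flags the token `eval`, and `repair` substitutes a fixed safe token, which is far simpler than the optimization and program analysis the abstract describes.

```python
import random

MASK = "<mask>"

def toy_denoiser(seq, rng):
    """Hypothetical stand-in for a diffusion model: proposes a token per masked slot."""
    vocab = ["x", "=", "eval", "int", "(", ")", "s"]
    return [rng.choice(vocab) if tok == MASK else tok for tok in seq]

def violating_positions(seq):
    """Toy program-level check (assumption): flag every use of the token `eval`."""
    return [i for i, tok in enumerate(seq) if tok == "eval"]

def repair(seq, positions, safe_token="int"):
    """Localized edit: rewrite only the flagged positions, leave the rest untouched."""
    out = list(seq)
    for i in positions:
        out[i] = safe_token
    return out

def constrained_sample(length=8, steps=4, seed=0):
    """Unmask tokens over several steps, checking the whole sequence after each step."""
    rng = random.Random(seed)
    seq = [MASK] * length
    for step in range(steps):
        proposal = toy_denoiser(seq, rng)
        masked = [i for i, tok in enumerate(seq) if tok == MASK]
        remaining_steps = steps - step
        # Commit an even share of the still-masked positions (ceiling division).
        k = -(-len(masked) // remaining_steps)
        for i in masked[:k]:
            seq[i] = proposal[i]
        # Constraint-aware step: inspect the intermediate state and repair it
        # before any further tokens are committed.
        seq = repair(seq, violating_positions(seq))
    return seq

if __name__ == "__main__":
    print(" ".join(constrained_sample()))
```

Because the check runs on the intermediate sequence after every step, a violation is corrected at the positions that caused it rather than by regenerating the whole program, which is the property the summary refers to as localized edits.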
