DreamOn: Diffusion Language Models For Code Infilling Beyond Fixed-size Canvas

Zirui Wu; Lin Zheng; Zhihui Xie; Jiacheng Ye; Jiahui Gao; Shansan Gong; Yansong Feng; Zhenguo Li; Wei Bi; Guorui Zhou; Lingpeng Kong

arXiv:2602.01326·cs.CL·February 3, 2026


Open Access · 3 Reviews

TL;DR

DreamOn introduces a diffusion framework for code infilling that allows dynamic, variable-length output generation, overcoming fixed-length constraints and matching state-of-the-art autoregressive models in performance.

Contribution

It presents a novel diffusion-based method with length control states enabling autonomous length adjustment without architectural changes.

Findings

01

Achieves state-of-the-art infilling performance on benchmarks.

02

Matches oracle performance with ground-truth length.

03

Enables flexible, variable-length code generation.

Abstract

Diffusion Language Models (DLMs) present a compelling alternative to autoregressive models, offering flexible, any-order infilling without specialized prompting design. However, their practical utility is blocked by a critical limitation: the requirement of a fixed-length masked sequence for generation. This constraint severely degrades code infilling performance when the predefined mask size mismatches the ideal completion length. To address this, we propose DreamOn, a novel diffusion framework that enables dynamic, variable-length generation. DreamOn augments the diffusion process with two length control states, allowing the model to autonomously expand or contract the output length based solely on its own predictions. We integrate this mechanism into existing DLMs with minimal modifications to the training objective and no architectural changes. Built upon Dream-Coder-7B and…
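The length-control idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the token names `[expand]` and `[delete]` come from the reviews on this page, but the exact semantics used here (an `[expand]` prediction turns one mask into two masks, a `[delete]` prediction removes the masked slot) and the helper name `apply_length_control` are assumptions made for illustration only.

```python
from typing import List

# Special symbols. The control-token names follow the reviews on this page;
# their exact behavior below is an assumption for illustration.
MASK, EXPAND, DELETE = "[mask]", "[expand]", "[delete]"


def apply_length_control(canvas: List[str], predictions: List[str]) -> List[str]:
    """Apply one round of per-position predictions to a masked canvas.

    predictions[i] is the (hypothetical) model output for canvas[i]; it is
    ignored wherever canvas[i] is not a mask.
    Assumed rules:
      * [expand] -> the mask becomes two masks (canvas grows by one)
      * [delete] -> the mask is dropped (canvas shrinks by one)
      * any other token -> the mask is filled with that token
    """
    if len(canvas) != len(predictions):
        raise ValueError("canvas and predictions must have the same length")

    out: List[str] = []
    for token, pred in zip(canvas, predictions):
        if token != MASK:
            out.append(token)          # already-decoded or context token
        elif pred == EXPAND:
            out.extend([MASK, MASK])   # grow the canvas
        elif pred == DELETE:
            continue                   # shrink the canvas
        else:
            out.append(pred)           # ordinary unmasking
    return out


# A two-mask canvas that first grows, then shrinks, across two steps.
step0 = ["return", MASK, MASK, ";"]
step1 = apply_length_control(step0, ["", "x", EXPAND, ""])
step2 = apply_length_control(step1, ["", "", "+", DELETE, ""])
```

Here `step1` is `["return", "x", "[mask]", "[mask]", ";"]` and `step2` is `["return", "x", "+", ";"]`. The canvas length changes only as a result of what is predicted at masked positions, which is the property the abstract describes as adjusting length "based solely on its own predictions".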

Peer Reviews

Decision·ICLR 2026 Poster

Reviewer 01 · Rating 6 · Confidence 4

Strengths

1. This paper addresses the fixed-length mask, a major practical limitation for diffusion language models. The `[expand]` and `[delete]` token solution is elegant, effective, and requires no architectural changes.
2. The paper has very nice presentation, with illustrative examples of the control tokens and their effects.
3. The paper clearly validates the necessity of both the expansion and deletion mechanisms through detailed experiments and ablations.
4. The proposed method allows diffus

Weaknesses

- The method is presented as a general solution for Diffusion Language Models, but it is only evaluated on Python code infilling. It is unclear if the `[expand]` and `[delete]` logic would translate well to more fluid and creative natural language tasks.
- The deletion broadcasting is an example of a "training-free" efficiency, but it seems like it would be better suited if this were learned in a data-driven manner.

Reviewer 02 · Rating 6 · Confidence 4

Strengths

- Addresses a critical and practical limitation of diffusion language models (fixed-length generation) with a minimal, plug-and-play solution.
- Achieves strong empirical results, closing the performance gap with autoregressive models and matching oracle-length performance on standard infilling benchmarks. Table 1's performance is impressive.
- Requires no architectural changes and integrates seamlessly into existing DLMs with only minor modifications to the training objective.

Weaknesses

- Clarity of Figure 2: The current illustration of the diffusion process in Figure 2 is hard to follow. It would be significantly clearer if the forward and backward processes were presented separately, with explicit depiction of how `[expand]` and `[delete]` tokens interact with the sequence during denoising.
- Inconsistent or Missing Baselines: The paper mentions evaluating LLaDA (line 305) but does not include it in any experimental results (e.g., Table 1 or 2), making it difficult to assess comp

Reviewer 03 · Rating 2 · Confidence 3

Strengths

1. The paper proposes a clear mechanism (expand/delete sentinel states) that gives masked diffusion models native variable-length control without changing model architecture.
2. Empirical results show strong improvements on code-infilling benchmarks, achieving near-oracle accuracy and robust generation compared to prior diffusion and autoregressive baselines.

Weaknesses

1. I feel the baselines are weak: simply plugging this into a DLM will almost certainly yield improvements. You should compare against other methods applied to DLMs to see whether DREAMON still has an advantage. Or are you the first to implement dynamic-length generation on diffusion language models?
2. Claiming in Contribution 1 and 2 that you’ve solved the fixed-length bottleneck of DLMs feels overstated, because the validation is only on code infilling. If you want to make that claim, you n

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning · Topic Modeling