Unlocking Prompt Infilling Capability for Diffusion Language Models

Yoshinari Fujinuma; Keisuke Sakaguchi

arXiv:2604.03677·cs.CL·April 7, 2026

TL;DR

This paper shows that extending supervised fine-tuning from response-only masking to full-sequence masking lets diffusion language models infill prompts; the infilled prompts match or surpass manually designed templates and transfer across models.

Contribution

The authors introduce a training modification, full-sequence masking during supervised fine-tuning, that enables diffusion language models to perform prompt infilling. The result indicates that training practices, not architecture, limit this capability.

Findings

01 Model-infilled prompts match or outperform manually designed templates.

02 Model-infilled prompts transfer effectively across models.

03 Full-sequence masking during SFT unlocks prompt infilling in diffusion language models.

Abstract

Masked diffusion language models (dLMs) generate text through bidirectional denoising, yet this capability remains locked for infilling prompts. This limitation is an artifact of the current supervised fine-tuning (SFT) convention of applying response-only masking. To unlock this capability, we extend SFT to full-sequence masking, in which both prompts and responses are masked jointly. Once unlocked, the model infills masked portions of a prompt template conditioned on few-shot examples. We show that such model-infilled prompts match or surpass manually designed templates, transfer effectively across models, and are complementary to existing prompt optimization methods. Our results suggest that training practices, not architectural limitations, are the primary bottleneck preventing masked diffusion language models from infilling effective prompts.
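
The difference between response-only and full-sequence masking can be illustrated with a minimal sketch. This is not the authors' implementation: the `[MASK]` token string, the function name `mask_for_sft`, and the independent per-token masking with a single `mask_ratio` are assumptions made for illustration only.

```python
import random

MASK = "[MASK]"  # hypothetical mask token; real tokenizers define their own


def mask_for_sft(prompt, response, mask_ratio, full_sequence, rng):
    """Noise one SFT example (illustrative sketch, assumed scheme).

    Each eligible token is replaced by MASK independently with
    probability `mask_ratio`. With full_sequence=False only response
    tokens are eligible (response-only masking); with full_sequence=True
    prompt tokens are eligible as well.

    Returns the noised token list and the masked positions, which are
    the positions a denoising loss would be computed on.
    """
    tokens = list(prompt) + list(response)
    first_eligible = 0 if full_sequence else len(prompt)
    noised = list(tokens)
    masked_positions = []
    for i in range(first_eligible, len(tokens)):
        if rng.random() < mask_ratio:
            noised[i] = MASK
            masked_positions.append(i)
    return noised, masked_positions


rng = random.Random(0)
prompt = ["Classify", "the", "sentiment", ":"]
response = ["positive"]

# mask_ratio=1.0 masks every eligible token, which makes the contrast visible.
response_only, _ = mask_for_sft(prompt, response, 1.0, False, rng)
full_seq, _ = mask_for_sft(prompt, response, 1.0, True, rng)
print(response_only)  # prompt tokens intact, response masked
print(full_seq)       # prompt and response both masked
```

Under response-only masking the model never sees a masked prompt token during SFT, so it is never trained to reconstruct one. Making prompt positions eligible is the only change in this sketch, which mirrors the claim that the limitation comes from the training convention and not from the architecture.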
