Plan for Speed: Dilated Scheduling for Masked Diffusion Language Models

Omer Luxembourg; Haim Permuter; Eliya Nachmani

arXiv:2506.19037·cs.CL·May 14, 2026

TL;DR

This paper introduces the Dilated Unmasking Scheduler (DUS) for masked diffusion language models. DUS unmasks tokens in parallel with a predictable speedup and outperforms confidence-based planners on math, code, general-knowledge, and instruction-following benchmarks.

Contribution

The paper proposes a novel, inference-only scheduler that partitions sequence positions into dilated groups for parallel unmasking, improving the speed-quality trade-off in masked diffusion language models (MDLMs).

Findings

1. DUS achieves up to 5.8x speedup over token-by-token decoding.
2. DUS outperforms confidence-based planners across multiple benchmarks.
3. Dilated spacing enhances adaptive samplers when used as a post-filter.

Abstract

Masked diffusion language models (MDLMs) promise fast, non-autoregressive text generation, yet existing samplers, which pick tokens to unmask based on model confidence, ignore interactions when unmasking multiple positions in parallel and effectively reduce to slow, autoregressive behavior. We propose the Dilated Unmasking Scheduler (DUS), an inference-only, planner-model-free method that partitions sequence positions into non-adjacent dilated groups and unmasks them in parallel so as to minimize an upper bound on joint entropy gain at each denoising step. By explicitly trading off the number of network calls against generation quality, DUS recovers most of the performance lost under traditional parallel unmasking strategies. Across math (GSM8K, MATH500), code (HumanEval, MBPP), general-knowledge (BBH, MMLU-Pro), and instruction following (IFEval) benchmarks, DUS outperforms…
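To make the idea of "non-adjacent dilated groups" concrete, here is a minimal sketch of one way to partition positions so that no two positions unmasked in the same step are neighbors. It is an illustration only, not the authors' implementation: the function name `dilated_groups`, the stride-halving rule, and the coarse-to-fine ordering are all assumptions chosen for simplicity, and the schedule in the paper may differ.

```python
def dilated_groups(n):
    """Partition positions 0..n-1 into groups using halving strides.

    Hypothetical helper (not from the paper): each round visits positions on a
    grid with the current stride and keeps only those not revealed earlier.
    Positions that share a group are therefore at least 2 apart.
    """
    if n <= 0:
        return []

    # Start from the smallest power of two that is >= n (coarsest grid).
    stride = 1
    while stride < n:
        stride *= 2

    seen = set()
    groups = []
    while stride >= 1:
        # New positions at this stride are the grid points not yet revealed.
        group = [p for p in range(0, n, stride) if p not in seen]
        if group:
            groups.append(group)
            seen.update(group)
        stride //= 2
    return groups


if __name__ == "__main__":
    for step, group in enumerate(dilated_groups(8)):
        print(step, group)
```

Under these assumptions a block of 8 positions is covered in 4 parallel rounds rather than 8 sequential ones, and the number of rounds grows roughly logarithmically with the block length. In an actual sampler each group would correspond to one network call that unmasks all of its positions at once.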

Code & Models

Repositories

omerlux/DUS (GitHub)
