Diffusion Language Models Are Natively Length-Aware

Vittorio Rossi; Giacomo Cirò; Davide Beltrame; Luca Gandolfi; Paul Röttger; Dirk Hovy

arXiv:2603.06123 · cs.CL · March 9, 2026

Open Access

TL;DR

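Diffusion Language Models (DLMs) denoise a fixed maximum-length context window regardless of how long the response needs to be. Estimating the required output length from the latent prompt representation lets the context window be cropped before generation, which reduces computation without significant performance degradation.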

Contribution

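The paper introduces a zero-shot mechanism that crops the context window of a diffusion language model before generation begins, using an output length estimated from the latent prompt representation. The shorter window requires fewer diffusion steps and less computation.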

Findings

01. Massive FLOPs reduction across tasks

02. No significant performance degradation

03. Performance improvement in some tasks

Abstract

Unlike autoregressive language models, which terminate variable-length generation upon predicting an End-of-Sequence (EoS) token, Diffusion Language Models (DLMs) operate over a fixed maximum-length context window for a predetermined number of denoising steps. However, this process is independent of the required response length, resulting in computational waste for the majority of short responses common in reasoning and chat tasks. To address this problem, we conjecture that the latent prompt representation contains sufficient information to estimate the required output length. We provide empirical evidence for this phenomenon and propose a zero-shot mechanism to dynamically crop the context window before generation begins, leading to fewer diffusion steps and substantial computational savings. We evaluate our approach on four benchmarks with diverse tasks -- GSM8K (reasoning),…
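The cropping idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the length estimate is taken as a given input (how the paper derives it from the latent prompt representation is not shown here), and the names `plan_crop` and `relative_cost`, the safety `margin`, the `block` rounding, the proportional scaling of denoising steps, and the linear cost model are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class CropPlan:
    gen_length: int  # masked response positions kept after cropping
    num_steps: int   # denoising steps to run on the cropped window


def plan_crop(predicted_len, max_len, max_steps, margin=1.25, block=32):
    """Turn a (hypothetical) length estimate into a cropped generation plan.

    Assumptions, not taken from the paper: the estimate is padded by
    `margin`, rounded up to a multiple of `block`, capped at `max_len`,
    and the step count shrinks in proportion to the window length.
    """
    if predicted_len < 1:
        raise ValueError("predicted_len must be at least 1")
    padded = math.ceil(predicted_len * margin)
    gen_length = min(max_len, math.ceil(padded / block) * block)
    num_steps = max(1, round(max_steps * gen_length / max_len))
    return CropPlan(gen_length=gen_length, num_steps=num_steps)


def relative_cost(prompt_len, plan, max_len, max_steps):
    """Rough cost of the cropped run relative to the full-window run.

    Cost is modeled as steps x sequence length, which ignores the
    quadratic attention term, so it is only a coarse proxy.
    """
    full = max_steps * (prompt_len + max_len)
    cropped = plan.num_steps * (prompt_len + plan.gen_length)
    return cropped / full


if __name__ == "__main__":
    plan = plan_crop(predicted_len=100, max_len=1024, max_steps=1024)
    print(plan)
    print(f"relative cost: {relative_cost(64, plan, 1024, 1024):.3f}")
```

Under these assumptions, a short predicted response shrinks both the window and the step count, so the proxy cost falls roughly with the square of the cropping ratio. An estimate at or above the maximum length leaves the full window and step budget unchanged.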

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications