Beyond Static Cutoffs: One-Shot Dynamic Thresholding for Diffusion Language Models
Jucheng Shen, Yeonju Ro

TL;DR
This paper introduces One-Shot Dynamic Thresholding (OSDT), a method for masked diffusion language models that calibrates confidence thresholds on a single sequence and reuses them for subsequent inputs, improving the accuracy-throughput trade-off of parallel decoding on GPQA, GSM8K, and HumanEval.
Contribution
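The paper proposes OSDT, a dynamic thresholding approach that calibrates thresholds once on a single sequence and applies them to subsequent inputs with negligible overhead. It replaces the static global confidence threshold used for parallel decoding in prior work such as Fast-dLLM and attains superior accuracy-throughput trade-offs on GPQA, GSM8K, and HumanEval.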
Findings
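OSDT raises decoding throughput by up to 50% in tokens/s: +24% on GSM8K at the best accuracy, +45% on GPQA with comparable accuracy, and +50% on HumanEval with a modest accuracy gap.
OSDT improves the accuracy-throughput trade-off on GPQA, GSM8K, and HumanEval.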
Reusable confidence signatures enable broader system innovations.
Abstract
Masked diffusion language models (MDLMs) are becoming competitive with their autoregressive counterparts but typically decode with fixed steps and sequential unmasking. To accelerate decoding, recent work such as Fast-dLLM enables parallel decoding via a static global confidence threshold, yet we observe strong block- and step-wise confidence fluctuations and, within a dataset, near-identical confidence trajectories across inputs as measured by cosine similarity. Motivated by these observations, we introduce One-Shot Dynamic Thresholding (OSDT), which calibrates thresholds on a single sequence and applies them to subsequent inputs with negligible overhead. On GPQA, GSM8K, and HumanEval, OSDT attains superior accuracy-throughput trade-offs (+24% tokens/s on GSM8K at the best accuracy, +45% on GPQA with comparable accuracy, and +50% on HumanEval with a modest accuracy gap). Beyond these…
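The abstract names three ingredients: per-block, per-step confidence statistics, cosine similarity between confidence trajectories, and thresholds calibrated once and then reused. The following is a minimal Python sketch of how these pieces could fit together. It is not the authors' implementation. The calibration rule (mean confidence minus a margin, clamped to a range), the parameter names `margin`, `floor`, and `cap`, and the fallback of always unmasking the single most confident token are all assumptions made for illustration; the abstract does not specify them.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two confidence trajectories of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def calibrate_thresholds(calib_conf, margin=0.05, floor=0.5, cap=0.9):
    """Derive one threshold per (block, step) from a single calibration sequence.

    calib_conf[block][step] is the list of token confidences observed at that
    block and step while decoding the calibration sequence.
    Assumed rule (hypothetical): mean confidence minus `margin`, clamped to
    the range [floor, cap].
    """
    thresholds = []
    for block in calib_conf:
        row = []
        for step_conf in block:
            mean = sum(step_conf) / len(step_conf)
            row.append(min(cap, max(floor, mean - margin)))
        thresholds.append(row)
    return thresholds


def select_tokens(confidences, threshold):
    """Return indices of masked positions to unmask in parallel at this step.

    Assumption: if no position clears the threshold, unmask the single most
    confident one so that decoding always makes progress.
    """
    chosen = [i for i, c in enumerate(confidences) if c >= threshold]
    if not chosen:
        chosen = [max(range(len(confidences)), key=confidences.__getitem__)]
    return chosen


# Toy calibration data: 2 blocks, 2 steps each (values are made up).
calib = [
    [[0.95, 0.60, 0.70], [0.90, 0.80]],
    [[0.55, 0.50, 0.45], [0.99]],
]
thresholds = calibrate_thresholds(calib)

# Reuse the calibrated threshold for block 0, step 0 on a new input.
new_conf = [0.97, 0.40, 0.88]
unmask_now = select_tokens(new_conf, thresholds[0][0])
```

A static global threshold corresponds to using the same value for every entry of `thresholds`. The sketch differs only in that the values vary by block and step and come from one calibration pass, which is inexpensive because later inputs perform a table lookup rather than recomputing anything.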
Taxonomy
Topics: Speech Recognition and Synthesis · Topic Modeling · Language and cultural evolution
