Discrete Stochastic Localization for Non-autoregressive Generation

Yunshu Wu; Jiayi Cheng; Longxuan Yu; Partha Thakuria; Rob Brekelmans; Evangelos E. Papalexakis; Greg Ver Steeg

arXiv:2605.12836·cs.LG·May 22, 2026

TL;DR

This paper introduces Discrete Stochastic Localization (DSL), a continuous-state framework for discrete sequence generation. One trained network supports a family of per-token SNR paths, and fine-tuning a pretrained MDLM checkpoint with DSL improves distributional faithfulness (MAUVE) on OpenWebText.

Contribution

The authors propose DSL, a continuous-state framework with unit-sphere token embeddings whose Bayes-optimal denoiser is invariant to the nominal SNR, so a single trained model supports a family of per-token SNR paths.

Findings

01. Fine-tuning with DSL improves distributional faithfulness (MAUVE) on OpenWebText.

02. A single checkpoint supports multiple sampling methods, including autoregressive and hybrid approaches.

03. The method achieves effective sampling with as few as 48 steps without retraining.

Abstract

Continuous diffusion is a natural framework for non-autoregressive generation but has generally lagged behind masked discrete diffusion models (MDMs) on discrete sequence generation. We argue that the bottleneck is not continuity itself, but a representation in which denoising depends on timestep-indexed noise regimes. We introduce Discrete Stochastic Localization (DSL), a continuous-state framework with unit-sphere token embeddings whose Bayes-optimal denoiser is invariant to the nominal signal-to-noise ratio (SNR) under the localization channel. One trained network then supports an entire family of per-token SNR paths, with endpoint masked-diffusion paths as a special case. Fine-tuning a pretrained MDLM checkpoint with DSL substantially improves distributional faithfulness (MAUVE) on OpenWebText across all step budgets from $T = 128$ to $T = 1024$, and the same checkpoint…
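One way to read the SNR-invariance claim is through a single-token toy case. The sketch below is not the paper's implementation. It assumes the textbook stochastic-localization observation y = t*x + sqrt(t)*eps with eps ~ N(0, I), a small random embedding table, and a uniform prior; the function name `token_posterior` and all sizes are hypothetical. Under these assumptions, the log-likelihood of token k is <y, e_k> - t*||e_k||^2/2 plus terms shared by all tokens, so when every embedding has unit norm the t-dependent term cancels in the softmax and the posterior depends on y alone.

```python
import numpy as np

def token_posterior(y, embeddings, log_prior, t):
    """Posterior over tokens given y = t*x + sqrt(t)*eps (assumed channel)."""
    # log p(y | x = e_k) = <y, e_k> - t * ||e_k||^2 / 2 + (terms shared by all k)
    sq_norms = np.sum(embeddings**2, axis=1)
    logits = log_prior + embeddings @ y - 0.5 * t * sq_norms
    logits = logits - logits.max()  # stabilise the softmax
    probs = np.exp(logits)
    return probs / probs.sum()

rng = np.random.default_rng(0)
vocab_size, dim = 5, 8  # hypothetical toy sizes

# Unit-sphere embeddings: every row has norm 1.
E_unit = rng.normal(size=(vocab_size, dim))
E_unit /= np.linalg.norm(E_unit, axis=1, keepdims=True)

# Same directions, but with unequal norms, for contrast.
E_scaled = E_unit * np.arange(1, vocab_size + 1)[:, None]

log_prior = np.log(np.full(vocab_size, 1.0 / vocab_size))
y = rng.normal(size=dim)  # one fixed observation

# Evaluate the posterior at two very different nominal SNR values.
p_unit_low = token_posterior(y, E_unit, log_prior, t=0.1)
p_unit_high = token_posterior(y, E_unit, log_prior, t=10.0)
p_scaled_low = token_posterior(y, E_scaled, log_prior, t=0.1)
p_scaled_high = token_posterior(y, E_scaled, log_prior, t=10.0)

print("unit-norm, posterior unchanged by t:", np.allclose(p_unit_low, p_unit_high))
print("unequal norms, posterior unchanged by t:", np.allclose(p_scaled_low, p_scaled_high))
```

In this toy setting the posterior for unit-norm embeddings is the same for both values of t, while the unequal-norm table gives posteriors that shift with t. This only illustrates the single-token algebra; how the paper defines the localization channel for full sequences may differ.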
