Stratified Hazard Sampling: Minimal-Variance Event Scheduling for CTMC/DTMC Discrete Diffusion and Flow Models

Seunghwan Jang; SooJean Han

arXiv:2601.02799 · cs.LG · February 18, 2026

TL;DR

This paper introduces Stratified Hazard Sampling (SHS), an inference method for discrete diffusion models formulated as CTMC/DTMC processes. By stratifying event timing, it reduces the variance of per-position jump counts and improves sample quality.

Contribution

The paper proposes a training-free, hyperparameter-free stratified sampling technique that minimizes variance in event scheduling for discrete diffusion models, improving sample quality and robustness.

Findings

01. SHS improves sample quality across models.

02. SHS enhances robustness under lexical constraints.

03. Variance is minimized with stratified event placement.

Abstract

Uniform-noise discrete diffusion and flow models (e.g., D3PM, SEDD, UDLM, DFM) generate sequences non-autoregressively by iteratively refining randomly initialized vocabulary tokens through multiple context-dependent replacements. These models are typically formulated as time-inhomogeneous CTMC/DTMC processes and sampled using independent Bernoulli change decisions at each discretization step. This induces Poisson-binomial variance in per-position jump counts that grows with the number of required edits, leading to the characteristic under-editing (residual noise) and over-editing (cascading substitutions) failure modes that degrade sample quality, especially under tight discretization budgets. In contrast, absorbing-state (mask-start) models avoid this instability by allowing each position to jump at most once. We propose Stratified Hazard Sampling (SHS), a training-free, drop-in, and…
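The variance argument in the abstract can be checked numerically. The sketch below is an illustration only, not the authors' implementation: it simulates the jump count at one position under independent Bernoulli change decisions and compares it with the exact Poisson-binomial variance, sum of p_i (1 - p_i). For contrast it also includes a stratified counter built on classical systematic sampling (one uniform phase, unit-spaced thresholds on the running sum of per-step probabilities). That stratified scheme is an assumption chosen for illustration, since the abstract is truncated before the method is described, and the schedule `probs` and all function names are hypothetical.

```python
import random
import statistics


def poisson_binomial_variance(probs):
    """Exact variance of a sum of independent Bernoulli(p_i) draws."""
    return sum(p * (1.0 - p) for p in probs)


def bernoulli_jump_count(probs, rng):
    """Jump count when every step makes its own independent change decision."""
    return sum(1 for p in probs if rng.random() < p)


def stratified_jump_count(probs, rng):
    """Assumed stratified scheme (systematic sampling, not taken from the paper).

    Draw one uniform phase, then fire an event each time the running sum of
    per-step probabilities passes phase, phase + 1, phase + 2, ...
    """
    next_threshold = rng.random()
    cumulative = 0.0
    count = 0
    for p in probs:
        cumulative += p
        while cumulative > next_threshold:
            count += 1
            next_threshold += 1.0
    return count


def empirical_variance(sampler, probs, trials=20000, seed=0):
    """Population variance of the jump count over repeated simulated runs."""
    rng = random.Random(seed)
    return statistics.pvariance([sampler(probs, rng) for _ in range(trials)])


if __name__ == "__main__":
    probs = [0.25] * 10  # hypothetical schedule: 10 steps, 2.5 expected jumps
    print("exact Bernoulli variance:", poisson_binomial_variance(probs))
    print("empirical Bernoulli variance:",
          empirical_variance(bernoulli_jump_count, probs))
    print("empirical stratified variance:",
          empirical_variance(stratified_jump_count, probs))
```

Under these assumptions the stratified count can only take the two integers adjacent to the total probability mass, so its variance never exceeds 0.25 however many steps are used, while the Bernoulli variance grows with the sum of p_i (1 - p_i).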

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Markov Chains and Monte Carlo Methods · Model Reduction and Neural Networks