KLASS: KL-Guided Fast Inference in Masked Diffusion Models

Seo Hyun Kim; Sunwoo Hong; Hojung Jung; Youngrok Park; Se-Young Yun

arXiv:2511.05664 · cs.LG · March 9, 2026

TL;DR

KLASS is a sampling method for masked diffusion models that uses token-level KL divergence to speed up inference without retraining. It achieves up to 2.78x wall-clock speedup on reasoning benchmarks while maintaining or improving sample quality, and is validated on text, image, and molecular generation.

Contribution

Introduces KLASS, a KL-guided sampling technique that accelerates inference in masked diffusion models by unmasking multiple tokens based on stability, without additional training.

Findings

01. Achieves up to 2.78x speedup in reasoning tasks.

02. Maintains or improves sample quality over standard methods.

03. Effective across text, image, and molecular generation.

Abstract

Masked diffusion models have demonstrated competitive results on various tasks, including language generation. However, because of their iterative refinement process, inference is often bottlenecked by slow and static sampling speed. To overcome this problem, we introduce "KL-Adaptive Stability Sampling" (KLASS), a fast yet effective sampling method that exploits token-level KL divergence to identify stable, high-confidence predictions. By unmasking multiple tokens in each iteration without any additional model training, our approach speeds up generation significantly while maintaining sample quality. On reasoning benchmarks, KLASS achieves up to $2.78\times$ wall-clock speedups while improving performance over standard greedy decoding, attaining state-of-the-art results among diffusion-based samplers. We further validate KLASS across diverse domains, including text, image, and molecular…
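
The selection idea in the abstract (unmask every token whose prediction is both stable and high-confidence) can be illustrated with a small sketch. This is not the authors' implementation: it assumes that "stability" means a small KL divergence between a position's predicted distributions at two consecutive sampling steps, and the function names, thresholds, and toy distributions below are all hypothetical.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as lists of probabilities."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def select_stable_tokens(prev_probs, curr_probs, masked,
                         kl_threshold=0.01, conf_threshold=0.9):
    """Return the masked positions whose prediction looks stable and confident.

    prev_probs / curr_probs: per-position token distributions from two
    consecutive sampling steps (an assumption of this sketch).
    kl_threshold / conf_threshold: illustrative values, not from the paper.
    """
    selected = []
    for pos in masked:
        kl = kl_divergence(curr_probs[pos], prev_probs[pos])
        confidence = max(curr_probs[pos])
        # Unmask only if the distribution barely moved AND its top token is likely.
        if kl < kl_threshold and confidence > conf_threshold:
            selected.append(pos)
    return selected


# Toy example: 3 masked positions over a 3-token vocabulary.
prev = [[0.05, 0.90, 0.05], [0.40, 0.35, 0.25], [0.02, 0.03, 0.95]]
curr = [[0.04, 0.92, 0.04], [0.10, 0.80, 0.10], [0.02, 0.02, 0.96]]

stable = select_stable_tokens(prev, curr, masked=[0, 1, 2])
print(stable)  # → [0, 2]
```

In this toy run, positions 0 and 2 are unmasked together in one step, while position 1 stays masked because its distribution shifted sharply and its top probability is below the confidence threshold. Unmasking several positions per iteration in this way is what would reduce the number of sampling steps.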

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Topic Modeling · Computational and Text Analysis Methods