IDLM: Inverse-distilled Diffusion Language Models

David Li; Nikita Gushchin; Dmitry Abulkhanov; Eric Moulines; Ivan Oseledets; Maxim Panov; Alexander Korotin

arXiv:2602.19066·cs.LG·February 24, 2026

Open Access

TL;DR

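This paper introduces IDLM, a method that extends inverse distillation to discrete diffusion language models. It reduces the number of inference steps by up to 64x while keeping entropy and perplexity comparable to the teacher model, and it addresses the two obstacles the extension raises: the lack of a uniqueness guarantee for the objective and unstable backpropagation through discrete tokens.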

Contribution

The paper develops a theoretically sound and practically stable inverse distillation technique for discrete diffusion language models, enabling 4x-64x faster inference.

Findings

01 Reduces inference steps by up to 64 times

02 Maintains entropy and perplexity comparable to teacher models

03 Provides theoretical guarantees for unique solutions

Abstract

Diffusion Language Models (DLMs) have recently achieved strong results in text generation. However, their multi-step sampling leads to slow inference, limiting practical use. To address this, we extend Inverse Distillation, a technique originally developed to accelerate continuous diffusion models, to the discrete setting. Nonetheless, this extension introduces both theoretical and practical challenges. From a theoretical perspective, the inverse distillation objective lacks uniqueness guarantees, which may lead to suboptimal solutions. From a practical standpoint, backpropagation in the discrete space is non-trivial and often unstable. To overcome these challenges, we first provide a theoretical result demonstrating that our inverse formulation admits a unique solution, thereby ensuring valid optimization. We then introduce gradient-stable relaxations to support effective training. As…
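
The abstract notes that backpropagation through discrete tokens is non-trivial. The sketch below illustrates only that general difficulty, using a plain softmax relaxation as a stand-in; it is not the relaxation proposed in the paper, which this excerpt does not describe. The names `hard_score`, `relaxed_score`, `finite_difference`, and the `token_scores` values are hypothetical and chosen for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax with a temperature parameter."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def hard_score(logits, token_scores):
    """Score of the argmax token: piecewise constant in the logits."""
    best = max(range(len(logits)), key=lambda i: logits[i])
    return token_scores[best]

def relaxed_score(logits, token_scores, temperature=1.0):
    """Expected score under softmax(logits / temperature): smooth in the logits."""
    probs = softmax(logits, temperature)
    return sum(p * s for p, s in zip(probs, token_scores))

def finite_difference(fn, logits, index, eps=1e-4):
    """Central finite-difference estimate of d fn / d logits[index]."""
    up = list(logits)
    up[index] += eps
    down = list(logits)
    down[index] -= eps
    return (fn(up) - fn(down)) / (2 * eps)

logits = [2.0, 0.5, -1.0]
token_scores = [1.0, 3.0, -2.0]  # hypothetical per-token downstream scores

# Discrete choice: a small change in a logit does not change the argmax,
# so the gradient estimate carries no training signal.
hard_grad = finite_difference(lambda l: hard_score(l, token_scores), logits, 1)

# Relaxed choice: the expectation over tokens responds smoothly to the logit.
soft_grad = finite_difference(lambda l: relaxed_score(l, token_scores), logits, 1)

print("hard gradient:", hard_grad)
print("relaxed gradient:", soft_grad)
```

The hard path yields a zero gradient away from ties, while the relaxed path yields a usable one. Lower temperatures make the relaxed output closer to the discrete choice, but the gradient estimates tend to become less well-behaved, which is one common source of the instability the abstract mentions.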

Taxonomy

Topics: Topic Modeling · Computational and Text Analysis Methods · Generative Adversarial Networks and Image Synthesis