Drifting Objectives for Refining Discrete Diffusion Language Models
Daisuke Oba, Hiroki Furuta, Naoaki Okazaki

TL;DR
This paper introduces TokenDrift, a novel drifting objective for discrete diffusion language models that improves text generation quality by leveraging anti-symmetric drifting in a semantic space.
Contribution
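The paper adapts the drifting principle, previously used for continuous generators, to DDLMs. Its objective, TokenDrift, lifts categorical predictions to soft-token features, applies anti-symmetric drifting in a frozen semantic space, and backpropagates the resulting stop-gradient feature target to the DDLM logits.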
Findings
TokenDrift reduces perplexity by 89% at 4 NFEs (number of function evaluations) on MDLM.
It improves generation quality over the baselines in controlled continual-training experiments with masked and uniform-state diffusion backbones.
The results suggest that a drifting objective is a practical way to refine DDLMs.
Abstract
Discrete diffusion language models (DDLMs) generate text by iteratively denoising categorical token sequences, while recent drifting methods for continuous generators suggest that part of this sampling-time correction can instead be absorbed into training through an anti-symmetric fixed-point objective. We study how to transfer this principle to DDLMs, where the main challenge is the interface with discrete text: hard token samples are non-differentiable, and categorical predictions do not directly provide continuous samples to drift. We formulate TokenDrift, a drifting objective that lifts categorical predictions to soft-token features, applies anti-symmetric drifting in a frozen semantic space, and backpropagates the resulting stop-gradient feature target to DDLM logits. In controlled continual-training experiments with masked and uniform-state diffusion backbones, TokenDrift improves…
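The three steps named in the abstract (soft-token lifting, anti-symmetric drifting in a frozen space, and a stop-gradient feature target backpropagated to logits) can be illustrated with a small NumPy sketch. It is not the authors' implementation. Everything specific here is an assumption made for illustration: the frozen semantic space is reduced to a fixed embedding table `embed`, the drift field is a simple kernel-weighted mean-shift difference, and the names `soft_token_features`, `mean_shift`, `drift_field`, and `drifting_loss_and_grad` are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def soft_token_features(logits, embed):
    """Lift categorical predictions to continuous features.

    Assumption: the feature is the expected embedding under softmax(logits),
    with `embed` standing in for a frozen semantic space.
    """
    return softmax(logits) @ embed  # (N, V) @ (V, D) -> (N, D)

def mean_shift(x, samples, temperature=1.0):
    """Kernel-weighted average of (samples - x) for each row of x."""
    d2 = ((x[:, None, :] - samples[None, :, :]) ** 2).sum(-1)  # (N, M)
    w = softmax(-d2 / temperature, axis=1)  # each row sums to 1
    return w @ samples - x

def drift_field(x, data_feats, gen_feats, temperature=1.0):
    """Attraction toward data features minus attraction toward generated ones.

    Swapping the two sample sets flips the sign (anti-symmetry), so the
    field vanishes when both sets coincide.
    """
    return (mean_shift(x, data_feats, temperature)
            - mean_shift(x, gen_feats, temperature))

def drifting_loss_and_grad(logits, embed, data_feats, temperature=1.0):
    """Squared distance to a drifted target, with the gradient w.r.t. logits.

    The target is treated as a constant (stop-gradient), so the gradient
    flows only through the soft-token features.
    """
    probs = softmax(logits)
    feats = probs @ embed
    target = feats + drift_field(feats, data_feats, feats, temperature)
    diff = feats - target                      # target held fixed
    loss = 0.5 * (diff ** 2).sum(axis=1).mean()
    g_feats = diff / len(feats)                # d loss / d feats
    g_probs = g_feats @ embed.T                # back through the embedding
    # Back through softmax: p * (g - <p, g>) per row.
    g_logits = probs * (g_probs - (probs * g_probs).sum(axis=1, keepdims=True))
    return loss, g_logits

# Toy usage with random stand-ins for logits, embeddings, and data features.
rng = np.random.default_rng(0)
logits = rng.normal(size=(8, 16))      # 8 token positions, vocabulary of 16
embed = rng.normal(size=(16, 4))       # frozen 4-dimensional embedding table
data_feats = rng.normal(size=(10, 4))  # features of reference text tokens
loss, g_logits = drifting_loss_and_grad(logits, embed, data_feats)
logits = logits - 0.1 * g_logits       # one plain gradient step on the logits
```

Because the field is anti-symmetric, it is zero when the generated features are distributed like the data features, which is the fixed point such an objective trains toward. A real system would replace the embedding table with a frozen encoder and compute gradients with an autodiff framework rather than by hand.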
