Remask, Don't Replace: Token-to-Mask Refinement in Diffusion Large Language Models

Lin Yao

arXiv:2604.18738 · cs.CL · May 8, 2026

TL;DR

The paper introduces Token-to-Mask (T2M) remasking, a training-free method for diffusion large language models that resets suspicious tokens to the mask token [M] so that later mask-filling steps can re-predict them, improving self-correction during parallel token generation.

Contribution

Proposes T2M remasking, a training-free approach to self-correction in diffusion large language models that outperforms token replacement methods.

Findings

01 T2M improves accuracy by +13.33 points on AIME 2025.

02 T2M improves accuracy by +8.56 points on CMATH.

03 For self-correction, remasking suspect tokens is more reliable than overwriting them.

Abstract

Diffusion large language models (dLLMs) gain speed by committing multiple tokens in parallel at each denoising step, but any erroneous commitment persists as conditioning context and biases every subsequent prediction. LLaDA2.1 repairs such errors with Token-to-Token (T2T) editing, which re-examines previously unmasked tokens and overwrites them when an alternative becomes sufficiently confident. We argue that this replacement action is itself the limiting factor: under polluted context, a confident replacement can propagate the error, while under a multimodal posterior no alternative may be confident enough to trigger an edit. We propose Token-to-Mask (T2M) remasking, a training-free rule that revokes suspicious commitments by resetting them to [M] and lets the subsequent mask-filling steps re-predict them from a cleaner context. T2M improves accuracy by +13.33 points on AIME 2025 and…
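
The contrast between the two repair actions described above can be sketched in a few lines of Python. This is a toy illustration only, not the paper's implementation: the `Position` record, the thresholds `tau_edit` and `tau_remask`, and the rule "remask when the committed token's probability falls below a threshold" are all assumptions made for the example, since the abstract does not specify how suspicious commitments are detected.

```python
from dataclasses import dataclass
from typing import List

MASK = "[M]"  # mask symbol, written as in the abstract


@dataclass
class Position:
    """Hypothetical per-position summary of one denoising step."""
    token: str        # currently committed token, or MASK
    p_current: float  # model probability of the committed token
    best_alt: str     # most likely alternative token
    p_alt: float      # model probability of that alternative


def t2t_edit(seq: List[Position], tau_edit: float = 0.9) -> List[str]:
    """T2T-style action: overwrite a committed token only when an
    alternative is confident enough (tau_edit is an assumed threshold)."""
    out = []
    for pos in seq:
        if pos.token != MASK and pos.p_alt >= tau_edit:
            out.append(pos.best_alt)
        else:
            out.append(pos.token)
    return out


def t2m_remask(seq: List[Position], tau_remask: float = 0.3) -> List[str]:
    """T2M-style action: revoke a suspicious commitment by resetting it to
    MASK. The suspicion test (p_current < tau_remask) is an assumption."""
    out = []
    for pos in seq:
        if pos.token != MASK and pos.p_current < tau_remask:
            out.append(MASK)
        else:
            out.append(pos.token)
    return out


# Toy sequence: the third token is doubtful, but its best alternative is
# not confident either (a multimodal posterior).
seq = [
    Position("2", 0.95, "3", 0.03),
    Position("+", 0.90, "-", 0.05),
    Position("5", 0.20, "3", 0.45),
    Position("=", 0.92, "+", 0.02),
]

print(t2t_edit(seq))    # no alternative clears tau_edit, so nothing changes
print(t2m_remask(seq))  # the doubtful token is reset to MASK
```

In this toy case the replacement rule leaves the doubtful token in place because no alternative reaches the edit threshold, while the remasking rule returns that position to [M] so a later mask-filling step can re-predict it. A real decoder would recompute the probabilities from the model after every step rather than read them from a fixed record.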
