Rejection Mixing: Fast Semantic Propagation of Mask Tokens for Efficient DLLM Inference

Yushi Ye; Feng Hong; Huangjie Zheng; Xu Chen; Zhiyong Chen; Yanfeng Wang; Jiangchao Yao

arXiv:2602.22868·cs.CL·February 27, 2026

Open Access

TL;DR

ReMix introduces a continuous refinement framework for diffusion large language models (DLLMs) that accelerates inference while maintaining quality by resolving semantic conflicts among tokens decoded in parallel.

Contribution

The paper proposes ReMix, a training-free method that uses a continuous intermediate state and a rejection rule to improve both inference speed and quality in DLLMs.

Findings

01 Achieves 2-8x inference speedup without quality loss.

02 Effectively resolves semantic contradictions during decoding.

03 Demonstrates robustness across various experimental settings.

Abstract

Diffusion Large Language Models (DLLMs) promise fast non-autoregressive inference but suffer a severe quality-speed trade-off in parallel decoding. This stems from the "combinatorial contradiction" phenomenon, where parallel tokens form semantically inconsistent combinations. We address this by integrating continuous representations into the discrete decoding process, as they preserve rich inter-position dependency. We propose ReMix (Rejection Mixing), a framework that introduces a novel Continuous Mixing State as an intermediate between the initial masked state and the final decoded token state. This intermediate state allows a token's representation to be iteratively refined in a continuous space, resolving mutual conflicts with other tokens before collapsing into a final discrete sample. Furthermore, a rejection rule reverts uncertain representations from the continuous state back…
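The three-state idea in the abstract (masked, continuous mixing, decoded token) can be illustrated with a toy sketch. This is not the authors' implementation: the confidence thresholds, the probability-weighted embedding used as the mixing state, the reversion of rejected positions to the masked state, and the names `toy_step` and `mix_embedding` are all assumptions made for illustration, since the abstract is truncated and does not specify these details.

```python
from typing import List, Optional, Tuple

Vector = List[float]
# A position is ("mask", None), ("mix", Vector), or ("token", int).
State = Tuple[str, Optional[object]]


def mix_embedding(probs: List[float], embed: List[Vector]) -> Vector:
    """Probability-weighted average of token embeddings (assumed mixing rule)."""
    dim = len(embed[0])
    return [sum(p * embed[v][d] for v, p in enumerate(probs)) for d in range(dim)]


def toy_step(
    states: List[State],
    probs: List[List[float]],
    embed: List[Vector],
    commit_thr: float = 0.9,   # hypothetical threshold for collapsing to a token
    reject_thr: float = 0.55,  # hypothetical threshold below which we reject
) -> List[State]:
    """One illustrative decoding step over all positions in parallel."""
    new_states: List[State] = []
    for state, p in zip(states, probs):
        if state[0] == "token":
            # Already decoded positions stay fixed.
            new_states.append(state)
            continue
        conf = max(p)
        if conf >= commit_thr:
            # Confident: collapse to a discrete token.
            new_states.append(("token", p.index(conf)))
        elif conf < reject_thr:
            # Uncertain: reject and revert to the masked state (assumption).
            new_states.append(("mask", None))
        else:
            # In between: keep a continuous representation for further refinement.
            new_states.append(("mix", mix_embedding(p, embed)))
    return new_states


# Toy vocabulary of two tokens with 2-d one-hot embeddings.
embed = [[1.0, 0.0], [0.0, 1.0]]
states = [("mask", None), ("mask", None), ("mask", None)]
probs = [[0.95, 0.05], [0.6, 0.4], [0.5, 0.5]]
result = toy_step(states, probs, embed)
```

In this toy run the first position is confident enough to collapse to a token, the second is held in a continuous mixed representation, and the third is rejected back to the mask. In a real DLLM the per-position distributions would come from the model on each step, with mixed representations fed back as inputs; that loop is omitted here.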

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning · Topic Modeling