Breaking Block Boundaries: Anchor-based History-stable Decoding for Diffusion Large Language Models

Shun Zou; Yong Wang; Zehui Chen; Lin Chen; Chongyang Tao; Feng Zhao; Xiangxiang Chu

arXiv:2604.08964 · cs.CL · April 13, 2026

TL;DR

This paper introduces Anchor-based History-stable Decoding (AHD), a training-free dynamic decoding strategy for diffusion large language models that monitors token stability in real time to cut decoding steps and improve benchmark performance.

Contribution

The paper presents AHD, a novel decoding method that leverages real-time stability monitoring to improve decoding speed and accuracy without additional training.

Findings

1. AHD reduces decoding steps by 80% on the BBH benchmark.
2. AHD improves performance by 3.67% on the BBH benchmark.
3. AHD reverses performance degradation in existing acceleration strategies.

Abstract

Diffusion Large Language Models (dLLMs) have recently become a promising alternative to autoregressive large language models (ARMs). Semi-autoregressive (Semi-AR) decoding is widely employed in base dLLMs and advanced decoding strategies due to its superior performance. However, our observations reveal that Semi-AR decoding suffers from inherent block constraints, which cause the decoding of many cross-block stable tokens to be unnecessarily delayed. To address this challenge, we systematically investigate the identification of stable tokens and present three key findings: (1) naive lookahead decoding is unreliable, (2) token stability closely correlates with convergence trend, and (3) historical information is isolated. Building on these insights, we propose Anchor-based History-stable Decoding (AHD), a training-free, plug-and-play dynamic decoding strategy. Specifically, AHD monitors…
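To make the block constraint described in the abstract concrete, here is a toy Python sketch. It is not AHD and it does not run a model: the per-position confidences, the 0.9 threshold, the fallback rule, and the function name `decode_schedule` are all hypothetical, and confidences are held fixed across steps, whereas a real dLLM recomputes them at every step. The sketch only compares when each position gets unmasked with and without a block boundary.

```python
from typing import List


def decode_schedule(conf: List[float], block_size: int, threshold: float) -> List[int]:
    """Return the 1-based step at which each position is unmasked.

    Toy rule (an assumption, not the paper's method): at every step, unmask
    all still-masked positions in the active block whose confidence is
    >= threshold; if none qualify, unmask the single most confident one so
    that decoding always makes progress.
    """
    n = len(conf)
    step_of = [0] * n  # 0 means "still masked"
    step = 0
    # Blocks are processed strictly left to right (the Semi-AR constraint).
    for start in range(0, n, block_size):
        block = range(start, min(start + block_size, n))
        while any(step_of[i] == 0 for i in block):
            step += 1
            masked = [i for i in block if step_of[i] == 0]
            ready = [i for i in masked if conf[i] >= threshold]
            if not ready:
                ready = [max(masked, key=lambda i: conf[i])]
            for i in ready:
                step_of[i] = step
    return step_of


# Hypothetical fixed confidences for an 8-token sequence.
CONF = [0.95, 0.40, 0.30, 0.20, 0.99, 0.97, 0.50, 0.96]

semi_ar = decode_schedule(CONF, block_size=4, threshold=0.9)        # two blocks of 4
unconstrained = decode_schedule(CONF, block_size=8, threshold=0.9)  # one block, no boundary

print(semi_ar)        # [1, 2, 3, 4, 5, 5, 6, 5]
print(unconstrained)  # [1, 3, 4, 5, 1, 1, 2, 1]
```

In this toy setting, positions 4, 5, and 7 are already above the threshold at step 1, yet the block boundary holds them back until step 5, and the whole sequence takes 6 steps instead of 5. This is the kind of delay for cross-block stable tokens that the abstract attributes to Semi-AR decoding. How such tokens can be identified reliably is the question the paper addresses, and this sketch does not attempt to answer it.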
