Divide and Conquer: Accelerating Diffusion-Based Large Language Models via Adaptive Parallel Decoding
Xiangzhong Luo, Yilin An, Zhicheng Yu, Weichen Liu, Xu Yang

TL;DR
This paper introduces DiCo, an adaptive parallel decoding method for diffusion-based large language models that significantly accelerates inference by exploiting inherent parallelism while preserving generation quality.
Contribution
The paper proposes a novel divide-and-conquer decoding paradigm, DiCo, that bridges the gap between theoretical parallelism and practical performance in diffusion-based LLMs.
Findings
DiCo achieves substantial inference speedups.
DiCo maintains competitive generation quality.
The method effectively explores and exploits the parallelism inherent in dLLM decoding.
Abstract
Diffusion-based large language models (dLLMs) have shown promising performance across various reasoning tasks, establishing themselves as an alternative to autoregressive large language models (LLMs). Unlike autoregressive LLMs that generate one token per step based on all previous tokens, dLLMs theoretically enable parallel generation of multiple tokens at each decoding step. However, recent dLLMs still favor one-token-per-step generation in practice, as directly decoding multiple masked tokens often leads to degraded generation quality and stability. This reveals a substantial gap between the theoretical parallelism and practical performance of dLLMs. To bridge this gap, we introduce an adaptive parallel decoding approach, namely DiCo, which features a three-phase divide-and-conquer paradigm to unleash the inherent parallelism of dLLMs. During the Divide phase, DiCo first explores the…
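The contrast the abstract draws between one-token-per-step and multi-token-per-step decoding can be illustrated with a toy sketch. This is not DiCo and not any real dLLM: `toy_predict` is a hypothetical stand-in for a model forward pass that returns random confidences, and the confidence `threshold` rule is an assumed, generic way to decide how many masked positions to commit per step. It only shows why committing several tokens per step reduces the number of decoding steps; it says nothing about generation quality.

```python
import random

MASK = None  # placeholder for a position that has not been decoded yet


def toy_predict(seq, rng):
    """Hypothetical stand-in for a dLLM forward pass (assumption, not a real API).

    Returns {position: (token, confidence)} for every masked position.
    Confidences are random numbers in [0, 1), not real model probabilities.
    """
    return {i: (f"tok{i}", rng.random()) for i, t in enumerate(seq) if t is MASK}


def decode(length, threshold=None, seed=0):
    """Unmask a fully masked sequence and return (sequence, number_of_steps).

    threshold=None: commit only the single most confident token per step.
    threshold=x:    commit every token with confidence >= x, plus the most
                    confident one so that each step makes progress.
    """
    rng = random.Random(seed)
    seq = [MASK] * length
    steps = 0
    while any(t is MASK for t in seq):
        preds = toy_predict(seq, rng)
        best = max(preds, key=lambda i: preds[i][1])
        chosen = {best}
        if threshold is not None:
            chosen |= {i for i, (_, conf) in preds.items() if conf >= threshold}
        for i in chosen:
            seq[i] = preds[i][0]
        steps += 1
    return seq, steps


if __name__ == "__main__":
    _, sequential_steps = decode(16)               # one token per step
    _, parallel_steps = decode(16, threshold=0.5)  # several tokens per step
    print("one-token-per-step:", sequential_steps)
    print("threshold-based parallel:", parallel_steps)
```

In this toy setting the one-token-per-step schedule always needs as many steps as there are positions, while the thresholded schedule needs at most that many. A real model's confidences depend on context, which is where the quality and stability concerns described in the abstract arise.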
Taxonomy
Topics: Topic Modeling · Computational and Text Analysis Methods · Multimodal Machine Learning Applications
