Lookahead Path Likelihood Optimization for Diffusion LLMs
Xuejie Liu, Yap Vit Chun, Yitao Liang, Anji Liu

TL;DR
This paper introduces a method for optimizing the unmasking order in diffusion large language models: it estimates which decoding paths are most promising and selects them at inference time, which improves accuracy.
Contribution
It proposes Path LL, a trajectory-conditioned objective, and POKE, an efficient value estimator, to enhance unmasking path selection during inference in diffusion LLMs.
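To make the idea of a trajectory-conditioned objective concrete, here is a minimal sketch of how a path log-likelihood could be accumulated along one unmasking order. The summary does not give the paper's exact definition, so the formula used here (summing the log-probability of each token at the step it is unmasked) is an assumption, and `toy_model`, `MASK`, and `path_log_likelihood` are hypothetical names introduced only for illustration.

```python
import math
from typing import Dict, List, Optional, Sequence

MASK = None  # hypothetical marker for a still-masked position


def toy_model(state: Sequence[Optional[str]]) -> List[Dict[str, float]]:
    """Hypothetical 2-position denoiser over the vocabulary {"a", "b"}.

    Position 0 is always uncertain; position 1 becomes confident
    once position 0 has been revealed.
    """
    pos0 = {"a": 0.5, "b": 0.5}
    if state[0] == "a":
        pos1 = {"a": 0.9, "b": 0.1}
    elif state[0] == "b":
        pos1 = {"a": 0.1, "b": 0.9}
    else:
        pos1 = {"a": 0.5, "b": 0.5}
    return [pos0, pos1]


def path_log_likelihood(model, final_tokens: Sequence[str], order: Sequence[int]) -> float:
    """Sum log p(token | current partial state) along one unmasking order.

    Assumed definition: each token is scored at the step it is unmasked,
    conditioned on everything revealed before it.
    """
    state: List[Optional[str]] = [MASK] * len(final_tokens)
    total = 0.0
    for pos in order:
        probs = model(state)[pos]
        total += math.log(probs[final_tokens[pos]])
        state[pos] = final_tokens[pos]  # reveal the token before the next step
    return total


# The same final sequence scores differently under different unmasking orders.
ll_left_to_right = path_log_likelihood(toy_model, ["a", "a"], order=[0, 1])
ll_right_to_left = path_log_likelihood(toy_model, ["a", "a"], order=[1, 0])
```

In this toy setting the left-to-right order scores higher, because position 1 is only predicted confidently after position 0 is known. That order dependence is what makes the choice of unmasking path worth optimizing.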
Findings
Achieves 2-3% accuracy gains over baselines.
Improves the accuracy-compute Pareto frontier.
Enhances inference performance across 6 reasoning tasks.
Abstract
Diffusion Large Language Models (dLLMs) support arbitrary-order generation, yet their inference performance critically depends on the unmasking order. Existing strategies rely on heuristics that greedily optimize local confidence, offering limited guidance for identifying unmasking paths that are globally consistent and accurate. To bridge this gap, we introduce path log-likelihood (Path LL), a trajectory-conditioned objective that strongly correlates with downstream accuracy and enables principled selection of unmasking paths. To optimize Path LL at inference time, we propose POKE, an efficient value estimator that predicts the expected future Path LL of a partial decoding trajectory. We then integrate this lookahead signal into POKE-SMC, a Sequential Monte Carlo-based search framework for dynamically identifying optimal unmasking paths. Extensive experiments across 6 reasoning tasks…
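The abstract describes feeding a lookahead value estimate into a Sequential Monte Carlo search. The sketch below shows one generic way such a signal could drive a resampling step: each partial trajectory (particle) is scored by its accumulated path log-likelihood plus an estimate of its future path log-likelihood, and particles are resampled in proportion to the exponentiated score. This is a standard SMC resampling pattern under stated assumptions, not the paper's POKE-SMC algorithm. The function names and the additive scoring rule are hypothetical, and the value estimates are passed in as plain numbers rather than produced by a learned estimator.

```python
import math
import random
from typing import List, Sequence, TypeVar

P = TypeVar("P")


def normalized_weights(log_scores: Sequence[float]) -> List[float]:
    """Softmax over log-scores, shifted by the max for numerical stability."""
    m = max(log_scores)
    unnorm = [math.exp(s - m) for s in log_scores]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def lookahead_resample(
    particles: Sequence[P],
    path_ll_so_far: Sequence[float],
    value_estimates: Sequence[float],
    rng: random.Random,
) -> List[P]:
    """Resample partial trajectories using an assumed lookahead score.

    score = accumulated path log-likelihood + estimated future path log-likelihood
    (the additive combination is an assumption made for this sketch).
    """
    scores = [ll + v for ll, v in zip(path_ll_so_far, value_estimates)]
    weights = normalized_weights(scores)
    # Multinomial resampling: promising particles are duplicated, weak ones dropped.
    return rng.choices(list(particles), weights=weights, k=len(particles))


# Usage: three partial trajectories with equal likelihood so far; only the
# first is judged promising by the (hypothetical) value estimates.
rng = random.Random(0)
survivors = lookahead_resample(
    particles=["A", "B", "C"],
    path_ll_so_far=[-1.0, -1.0, -1.0],
    value_estimates=[0.0, -1000.0, -1000.0],
    rng=rng,
)
```

The point of the lookahead term is visible in the usage example: the three particles are indistinguishable by their likelihood so far, and only the value estimate separates them before compute is spent on extending the weak ones.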
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Graph Neural Networks
