Implicit Search via Discrete Diffusion: A Study on Chess
Jiacheng Ye, Zhenyu Wu, Jiahui Gao, Zhiyong Wu, Xin Jiang, Zhenguo Li, Lingpeng Kong

TL;DR
This paper introduces DiffuSearch, an implicit search method based on discrete diffusion modeling, which outperforms both searchless and explicit search-enhanced chess policies in action accuracy and puzzle solving.
Contribution
The paper presents DiffuSearch, a novel implicit search approach that leverages discrete diffusion to improve long-term planning in chess without explicit search algorithms.
Findings
DiffuSearch outperforms the one-step policy by 19.2% in action accuracy.
DiffuSearch exceeds the MCTS-enhanced policy by 14% in action accuracy.
DiffuSearch improves puzzle-solving by 30% and increases game strength by 540 Elo.
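To give a sense of scale for the 540 Elo figure, the standard Elo expected-score formula can be applied to a rating gap of that size. This is only a back-of-the-envelope sketch: the function name `expected_score` is made up for illustration, and the calculation says nothing about how the paper itself measured playing strength.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score of player A against player B.

    Returns a value in (0, 1), where a win counts 1, a draw 0.5, a loss 0.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


# Hypothetical pairing: two players separated by exactly 540 Elo points.
gap = 540
stronger = expected_score(gap, 0)
weaker = expected_score(0, gap)

print(f"stronger side expected score: {stronger:.3f}")
print(f"weaker side expected score:   {weaker:.3f}")
```

Under this formula, a 540-point gap corresponds to the stronger side scoring roughly 0.96 points per game on average.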
Abstract
In the post-AlphaGo era, there has been a renewed interest in search techniques such as Monte Carlo Tree Search (MCTS), particularly in their application to Large Language Models (LLMs). This renewed attention is driven by the recognition that current next-token prediction models often lack the ability for long-term planning. Is it possible to instill search-like abilities within the models to enhance their planning abilities without relying on explicit search? We propose DiffuSearch, a model that does implicit search by looking into the future world via discrete diffusion modeling. We instantiate DiffuSearch on a classical board game, Chess, where explicit search is known to be essential. Through extensive controlled experiments, we show DiffuSearch outperforms both the searchless and explicit search-enhanced policies. Specifically, DiffuSearch outperforms the one-step policy…
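As a rough illustration of what "discrete diffusion" over a sequence of future moves can look like, the sketch below shows the forward (noising) step of an absorbing-state discrete diffusion, in which tokens are progressively replaced by a mask symbol. Everything here is an assumption made for illustration: the linear masking schedule, the `[MASK]` token, the function name `absorb_noise`, and the example move strings are not taken from the paper, and the trained denoiser that would reverse this process is omitted.

```python
import random

MASK = "[MASK]"  # hypothetical absorbing token


def absorb_noise(tokens, t, num_steps, rng):
    """Forward step of an absorbing-state discrete diffusion (toy version).

    Each token is independently replaced by MASK with probability
    t / num_steps (an assumed linear schedule), so t = 0 leaves the
    sequence intact and t = num_steps masks everything.
    """
    p = t / num_steps
    return [MASK if rng.random() < p else tok for tok in tokens]


# Hypothetical "future" the model would learn to reconstruct:
# a short line of moves in UCI notation.
future = ["e2e4", "e7e5", "g1f3", "b8c6"]
rng = random.Random(0)

for t in (0, 2, 4):
    print(t, absorb_noise(future, t, num_steps=4, rng=rng))
```

A denoising model trained on such corrupted sequences would predict the masked future tokens from the current board state, which is one way to read the abstract's phrase "looking into the future world" without running an explicit search tree.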
Peer Reviews
Decision: ICLR 2025 Poster
The paper compares the proposed approach to other approaches.
The resulting chess program is very weak compared to the current state of the art. Lc0 and Stoofvlees use MCTS and deep neural networks and have an Elo greater than 3500 according to the Swedish Rating List. This is far above the 1728 Elo of DiffuSearch.
Overall, the idea of this work is novel and intuitive. The paper is well-written and easy-to-follow. The technical contents like the theorems as well as proofs are well-organized and sound. The experimental setup is detailed and clear. The empirical results substantiate that DiffuSearch outperforms the existing baselines. The demonstration plots like Figures 4 and 5 are very clear and intuitive.
Line 83. The link to the source code is invalid. I hope to see the code during the rebuttal. The authors may consider submitting via a zip file or providing a valid link to an anonymous repo. I would expect a more detailed explanation of Figure 1 in both the main text and the caption of Figure 1, as the comparison between the explicit and implicit searches is the main idea of this work. For example, the authors could explain the difference of explicit and implicit searches by describing Figure 1 mo
The paper investigates whether diffusion modeling can be helpful for emulating search using a feedforward network. This is an interesting question, as transformers have generally struggled thus far to solve problems requiring search. Normally, the solution is to add explicit search in the form of MCTS using the outputs of the transformer. This paper tries to improve the policy network instead, and indeed provides some evidence that transformers can simulate search using search with a single forw
1. I wish it were more clear what exactly the paper is arguing. The paper seems to provide evidence for the following claim: "If we must use transformers in tasks that require search, it is more efficient and effective to train them via diffusion to do implicit search than it is to add explicit search via MCTS." However, it seems as though the paper is instead arguing that implicit search is better than explicit search. There doesn't seem to be evidence for this, as the method still relies on a
Taxonomy
Topics: Artificial Intelligence in Games · Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research
Methods: Softmax · Attention Is All You Need · Diffusion
