Breaking the Reward Barrier: Accelerating Tree-of-Thought Reasoning via Speculative Exploration
Shuzhang Zhong, Haochen Huang, Shengxuan Qiu, Pengfei Zuo, Runsheng Wang, Meng Li

TL;DR
This paper introduces SPEX, a set of techniques that accelerate Tree-of-Thought (ToT) reasoning in LLMs through speculative exploration, achieving up to 4.1x speedup when combined with token-level speculative decoding and improving scalability.
Contribution
SPEX employs speculative path selection, dynamic resource allocation, and early termination to enhance ToT reasoning efficiency and scalability.
Findings
SPEX achieves 1.2x to 3x speedup across ToT algorithms.
Combining SPEX with token-level speculative decoding yields up to 4.1x speedup.
Ablation studies validate the effectiveness of each SPEX component.
Abstract
Tree-of-Thought (ToT) reasoning structures Large Language Model (LLM) inference as a tree-based search, demonstrating strong potential for solving complex mathematical and programming tasks. However, its efficiency is constrained by the reward dependency barrier -- a synchronization bottleneck caused by sequential reward-guided exploration that limits search parallelism and introduces substantial latency. Prior system optimizations, mainly designed for linear Chain-of-Thought (CoT) reasoning, cannot address these challenges, leaving the efficiency of ToT underexplored. To enhance ToT reasoning efficiency, we observe that the reasoning paths can be explored speculatively to break the reward synchronization barrier. Therefore, in this paper, we propose SPEX and introduce three key techniques: (i) intra-query speculative path selection to predict and expand high-potential branches of…
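The reward dependency barrier described in the abstract can be illustrated with a toy latency model. The sketch below is not SPEX's implementation; it only assumes that a cheap, hypothetical draft score guesses which candidate the reward model will prefer, so that the next level can start generating while the reward is still being computed. The names `Step`, `draft_scores`, `reward_scores`, `t_gen`, and `t_reward` are invented for illustration, and the model ignores the extra compute spent on wrong guesses.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Step:
    """One tree level: a score per candidate branch (all values are made up)."""
    draft_scores: List[float]   # hypothetical cheap guess, available immediately
    reward_scores: List[float]  # reward model output, available after t_reward


def argmax(xs: List[float]) -> int:
    return max(range(len(xs)), key=xs.__getitem__)


def sequential_latency(steps: List[Step], t_gen: float, t_reward: float) -> float:
    """Baseline: every level generates, then waits for the reward before expanding."""
    return sum(t_gen + t_reward for _ in steps)


def speculative_latency(steps: List[Step], t_gen: float, t_reward: float) -> float:
    """Expand the guessed branch while the reward for the current level is computed."""
    total = t_gen  # candidates of the first level
    for i, step in enumerate(steps):
        is_last = i == len(steps) - 1
        if is_last:
            total += t_reward  # nothing left to overlap with
        elif argmax(step.draft_scores) == argmax(step.reward_scores):
            # Correct guess: reward scoring and next-level generation overlap.
            total += max(t_reward, t_gen)
        else:
            # Wrong guess: speculative work is discarded, then regenerate.
            total += t_reward + t_gen
    return total


if __name__ == "__main__":
    demo = [
        Step([0.9, 0.1], [0.8, 0.2]),  # guess matches the reward model
        Step([0.9, 0.1], [0.2, 0.8]),  # guess is wrong
        Step([0.5, 0.5], [0.6, 0.4]),  # last level
    ]
    print(sequential_latency(demo, 1.0, 1.0), speculative_latency(demo, 1.0, 1.0))
```

In this simplified model a wrong guess costs the same latency as the sequential baseline, so speculation helps only through correct guesses; a real system would also have to account for the resources consumed by discarded branches.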
