Empirical-MCTS: Continuous Agent Evolution via Dual-Experience Monte Carlo Tree Search
Hao Lu, Haoyuan Huang, Yulin Zhou, Chen Li, Ningxin Zhu

TL;DR
Empirical-MCTS introduces a dual-loop framework that enhances reasoning in Large Language Models by combining local exploration with global memory optimization, enabling continuous learning from experience during inference.
Contribution
The paper presents Empirical-MCTS, a novel framework that transforms stateless search into a continuous learning process using pairwise feedback and global memory management.
Findings
Outperforms traditional stateless MCTS on complex reasoning benchmarks.
Demonstrates significant improvements in reasoning accuracy and efficiency.
Validates the importance of empirical experience accumulation in reasoning tasks.
Abstract
Inference-time scaling strategies, particularly Monte Carlo Tree Search (MCTS), have significantly enhanced the reasoning capabilities of Large Language Models (LLMs). However, current approaches remain predominantly stateless, discarding successful reasoning patterns after each problem instance and failing to mimic the empirical accumulation of wisdom characteristic of human problem-solving. To bridge this gap, we introduce Empirical-MCTS, a dual-loop framework that transforms stateless search into a continuous, non-parametric learning process. The framework unifies local exploration with global memory optimization through two novel mechanisms: Pairwise-Experience-Evolutionary Meta-Prompting (PE-EMP) and a Memory Optimization Agent. PE-EMP functions as a reflexive optimizer within the local search, utilizing pairwise feedback to dynamically synthesize adaptive criteria and evolve…
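The dual-loop idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a greedy pairwise loop stands in for the tree search, and every name here (ExperienceMemory, local_search, toy_propose, toy_prefer) is hypothetical. The inner loop compares an incumbent answer against a challenger and records an insight from each comparison; the outer loop is just the memory object surviving from one problem to the next. In a real system the propose and prefer callables would be LLM calls; here they are toy integer functions so the block runs on its own.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class ExperienceMemory:
    """Global store that persists across problems (hypothetical structure)."""
    insights: List[str] = field(default_factory=list)
    capacity: int = 8  # assumed to be >= 1

    def add(self, insight: str) -> None:
        # Skip duplicates, then evict the oldest entries beyond capacity.
        # This eviction rule is a placeholder for a real memory optimizer.
        if insight in self.insights:
            return
        self.insights.append(insight)
        while len(self.insights) > self.capacity:
            self.insights.pop(0)


def local_search(
    problem: int,
    propose: Callable[[int, List[str], Optional[int]], int],
    prefer: Callable[[int, int, int], Tuple[int, str]],
    memory: ExperienceMemory,
    rounds: int = 4,
) -> int:
    """Greedy pairwise loop (a stand-in for MCTS) that writes to memory."""
    best = propose(problem, memory.insights, None)
    for _ in range(rounds):
        challenger = propose(problem, memory.insights, best)
        best, insight = prefer(problem, best, challenger)
        memory.add(insight)  # local feedback feeds the global store
    return best


def toy_propose(target: int, insights: List[str], incumbent: Optional[int]) -> int:
    # Toy generator: start from the latest remembered hint, then step by one.
    if incumbent is None:
        return int(insights[-1].split("=")[1]) if insights else 0
    return incumbent + 1


def toy_prefer(target: int, a: int, b: int) -> Tuple[int, str]:
    # Toy pairwise judge: the candidate closer to the target wins.
    winner = a if abs(target - a) <= abs(target - b) else b
    return winner, f"start={winner}"


shared = ExperienceMemory()
first = local_search(3, toy_propose, toy_prefer, shared, rounds=4)
with_memory = local_search(5, toy_propose, toy_prefer, shared, rounds=2)
without_memory = local_search(5, toy_propose, toy_prefer, ExperienceMemory(), rounds=2)
```

In this toy run the second problem reaches 5 within two rounds when it reuses the shared memory, but only 2 when it starts from an empty one, which is the contrast between stateful and stateless search that the abstract draws.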
Taxonomy
Topics: Constraint Satisfaction and Optimization · Multimodal Machine Learning Applications · Natural Language Processing Techniques
