Empirical-MCTS: Continuous Agent Evolution via Dual-Experience Monte Carlo Tree Search

Hao Lu; Haoyuan Huang; Yulin Zhou; Chen Li; Ningxin Zhu

arXiv:2602.04248 · cs.AI · February 5, 2026

TL;DR

Empirical-MCTS introduces a dual-loop framework that enhances reasoning in Large Language Models by combining local exploration with global memory optimization, enabling continuous learning from experience during inference.

Contribution

The paper presents Empirical-MCTS, a novel framework that transforms stateless search into a continuous learning process using pairwise feedback and global memory management.

Findings

01. Outperforms traditional stateless MCTS on complex reasoning benchmarks.
02. Demonstrates significant improvements in reasoning accuracy and efficiency.
03. Validates the importance of empirical experience accumulation in reasoning tasks.

Abstract

Inference-time scaling strategies, particularly Monte Carlo Tree Search (MCTS), have significantly enhanced the reasoning capabilities of Large Language Models (LLMs). However, current approaches remain predominantly stateless, discarding successful reasoning patterns after each problem instance and failing to mimic the empirical accumulation of wisdom characteristic of human problem-solving. To bridge this gap, we introduce Empirical-MCTS, a dual-loop framework that transforms stateless search into a continuous, non-parametric learning process. The framework unifies local exploration with global memory optimization through two novel mechanisms: Pairwise-Experience-Evolutionary Meta-Prompting (PE-EMP) and a Memory Optimization Agent. PE-EMP functions as a reflexive optimizer within the local search, utilizing pairwise feedback to dynamically synthesize adaptive criteria and evolve…
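The stateless versus stateful distinction drawn in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: it omits the tree search entirely, and `ExperienceMemory`, `pairwise_prefer`, and `solve` are hypothetical names introduced here. The judge is a toy scoring rule standing in for an LLM comparison. The sketch only shows how a memory that persists across problem instances can change the outcome of pairwise comparisons in a later search.

```python
from dataclasses import dataclass, field


@dataclass
class ExperienceMemory:
    """Hypothetical global store that persists across problem instances."""
    insights: list[str] = field(default_factory=list)
    capacity: int = 4

    def add(self, insight: str) -> None:
        # Skip duplicates, then keep only the most recent `capacity` entries.
        if insight not in self.insights:
            self.insights.append(insight)
        self.insights = self.insights[-self.capacity:]


def pairwise_prefer(a: str, b: str, hints: list[str]) -> str:
    """Toy stand-in for an LLM judge (assumption, not the paper's criterion).

    Prefers the candidate containing more stored hints; ties go to the
    longer candidate, and exact ties go to `a`.
    """
    def score(candidate: str) -> tuple[int, int]:
        return (sum(h in candidate for h in hints), len(candidate))

    return a if score(a) >= score(b) else b


def solve(candidates: list[str], memory: ExperienceMemory) -> str:
    """Local loop: pairwise comparisons. Global loop: record the winner."""
    hints = list(memory.insights)  # snapshot of experience from earlier problems
    best = candidates[0]
    for challenger in candidates[1:]:
        best = pairwise_prefer(best, challenger, hints)
    memory.add(best)
    return best
```

Calling `solve(["zzzzzz", "abc!"], ExperienceMemory())` with an empty memory selects the longer candidate, while the same call after an earlier problem has stored `"abc"` selects `"abc!"`. A stateless search would behave like the first call every time.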

Taxonomy

Topics: Constraint Satisfaction and Optimization · Multimodal Machine Learning Applications · Natural Language Processing Techniques