ExACT: Teaching AI Agents to Explore with Reflective-MCTS and Exploratory Learning

Xiao Yu; Baolin Peng; Vineeth Vajipey; Hao Cheng; Michel Galley; Jianfeng Gao; Zhou Yu

arXiv:2410.02052 · cs.CL · March 3, 2025

TL;DR

This paper introduces ExACT, combining Reflective MCTS and Exploratory Learning to improve AI agents' exploration and decision-making in complex environments, achieving significant performance gains and efficient transfer of knowledge.

Contribution

The paper presents a novel approach integrating Reflective MCTS and Exploratory Learning to enhance AI agent exploration and learning at inference time, with demonstrated improvements on a challenging benchmark.

Findings

01. R-MCTS improves exploration efficiency through contrastive reflection.

02. Exploratory Learning enables agents to search and evaluate without external algorithms.

03. Agents trained with test-time search knowledge match high-performance benchmarks.

Abstract

Autonomous agents have demonstrated significant potential in automating complex multistep decision-making tasks. However, even state-of-the-art vision-language models (VLMs), such as GPT-4o, still fall short of human-level performance, particularly in intricate web environments and long-horizon tasks. To address these limitations, we present ExACT, an approach that combines test-time search and self-learning to build o1-like models for agentic applications. We first introduce Reflective Monte Carlo Tree Search (R-MCTS), a novel test-time algorithm designed to enhance AI agents' ability to explore the decision space on the fly. R-MCTS extends traditional MCTS by 1) incorporating contrastive reflection, allowing agents to learn from past interactions and dynamically improve their search efficiency; and 2) using multi-agent debate for reliable state evaluation. Next, we introduce Exploratory…
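To make the "traditional MCTS" baseline that R-MCTS extends more concrete, here is a minimal sketch of a generic UCT-style tree search in Python. It is not the authors' implementation: the toy string environment, the `search` and `evaluate` helpers, and the two scoring functions are all hypothetical. Averaging several evaluator scores is only a crude stand-in for the multi-agent debate mentioned in the abstract, and contrastive reflection is not modeled at all.

```python
import math


class Node:
    """One state in the search tree, with visit statistics."""

    def __init__(self, state, parent=None):
        self.state = state
        self.parent = parent
        self.children = {}  # action -> Node
        self.visits = 0
        self.value_sum = 0.0

    def mean_value(self):
        return self.value_sum / self.visits if self.visits else 0.0


def uct_score(parent, child, c=1.4):
    """Standard UCT: exploit the mean value, explore rarely visited children."""
    if child.visits == 0:
        return float("inf")
    return child.mean_value() + c * math.sqrt(math.log(parent.visits) / child.visits)


def evaluate(state, evaluators):
    """Hypothetical state value: the plain average of several evaluator scores."""
    return sum(e(state) for e in evaluators) / len(evaluators)


def search(root_state, actions, step, evaluators, n_iter=200):
    """Run n_iter rounds of select / expand / evaluate / backpropagate."""
    root = Node(root_state)
    for _ in range(n_iter):
        node = root
        # Selection: descend while the current node is fully expanded.
        while node.children and len(node.children) == len(actions(node.state)):
            parent = node
            node = max(parent.children.values(), key=lambda ch: uct_score(parent, ch))
        # Expansion: add one untried action, if any remain.
        untried = [a for a in actions(node.state) if a not in node.children]
        if untried:
            child = Node(step(node.state, untried[0]), parent=node)
            node.children[untried[0]] = child
            node = child
        # Evaluation: score the reached state instead of doing a random rollout.
        value = evaluate(node.state, evaluators)
        # Backpropagation: update statistics along the path back to the root.
        while node is not None:
            node.visits += 1
            node.value_sum += value
            node = node.parent
    # Recommend the most-visited root action.
    return max(root.children, key=lambda a: root.children[a].visits)


# Toy environment (an assumption for illustration): build the string "ba".
GOAL = "ba"
actions = lambda s: ["a", "b"] if len(s) < len(GOAL) else []
step = lambda s, a: s + a
evaluators = [
    lambda s: 1.0 if s == GOAL else 0.0,                       # exact-match judge
    lambda s: sum(x == y for x, y in zip(s, GOAL)) / len(GOAL),  # partial-credit judge
]

best = search("", actions, step, evaluators)
```

In this toy setting the search concentrates its visits on the branch that can reach the goal string. A web agent would replace the string states with page observations and the scoring lambdas with model-based evaluators.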

Code & Models

Repositories

Agent-E3/ExACT

Datasets

Columbia-NLP/ExACT-VWA
dataset · 7 dl

Taxonomy

Topics: Intelligent Tutoring Systems and Adaptive Learning

Methods: Self-Learning