Look Before You Leap: Autonomous Exploration for LLM Agents
Ziang Ye, Wentao Shi, Yuxin Liu, Yu Wang, Zhengzhou Cai, Yaorui Shi, Qi Gu, Xunliang Cai, Fuli Feng

TL;DR
This paper emphasizes the importance of autonomous exploration in large language model agents, introducing a new metric and training paradigm to improve their adaptability and generalization in unfamiliar environments.
Contribution
It introduces Exploration Checkpoint Coverage as a verifiable metric and proposes an Explore-then-Act paradigm to enhance exploration capabilities.
Findings
Agents trained with the new strategy show broader discovery of environment features.
Systematic exploration improves downstream task performance.
Decoupling exploration from exploitation enhances agent adaptability.
Abstract
Large language model based agents often fail in unfamiliar environments due to premature exploitation: a tendency to act on prior knowledge before acquiring sufficient environment-specific information. We identify autonomous exploration as a critical yet underexplored capability for building adaptive agents. To formalize and quantify this capability, we introduce Exploration Checkpoint Coverage, a verifiable metric that measures how broadly an agent discovers key states, objects, and affordances. Our systematic evaluation reveals that agents trained with standard task-oriented reinforcement learning consistently exhibit narrow and repetitive behaviors that impede downstream performance. To address this limitation, we develop a training strategy that interleaves task-execution rollouts and exploration rollouts, with each type of rollout optimized by its corresponding verifiable reward.…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
