Look Before You Leap: Autonomous Exploration for LLM Agents

Ziang Ye; Wentao Shi; Yuxin Liu; Yu Wang; Zhengzhou Cai; Yaorui Shi; Qi Gu; Xunliang Cai; Fuli Feng

arXiv:2605.16143·cs.AI·May 18, 2026

TL;DR

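This paper argues that large language model agents fail in unfamiliar environments because they act on prior knowledge before gathering enough environment-specific information. It introduces a verifiable metric for exploration and a training paradigm that separates exploration from task execution to improve adaptability and generalization.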

Contribution

The paper introduces Exploration Checkpoint Coverage, a verifiable metric of how broadly an agent discovers key states, objects, and affordances, and proposes an Explore-then-Act paradigm to strengthen exploration.

Findings

01

Agents trained with the new strategy show broader discovery of environment features.

02

Systematic exploration improves downstream task performance.

03

Decoupling exploration from exploitation enhances agent adaptability.

Abstract

Large language model based agents often fail in unfamiliar environments due to premature exploitation: a tendency to act on prior knowledge before acquiring sufficient environment-specific information. We identify autonomous exploration as a critical yet underexplored capability for building adaptive agents. To formalize and quantify this capability, we introduce Exploration Checkpoint Coverage, a verifiable metric that measures how broadly an agent discovers key states, objects, and affordances. Our systematic evaluation reveals that agents trained with standard task-oriented reinforcement learning consistently exhibit narrow and repetitive behaviors that impede downstream performance. To address this limitation, we develop a training strategy that interleaves task-execution rollouts and exploration rollouts, with each type of rollout optimized by its corresponding verifiable reward.…
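To make the idea of a coverage-style metric concrete, here is a minimal sketch. It assumes that Exploration Checkpoint Coverage can be read as the fraction of a predefined checkpoint set that an agent's trajectory reaches. The abstract does not give the exact formula, so the function name `checkpoint_coverage`, the checkpoint labels, and the toy trajectories are all hypothetical illustrations, not the authors' implementation.

```python
from typing import Iterable, Set


def checkpoint_coverage(visited: Iterable[str], checkpoints: Set[str]) -> float:
    """Fraction of predefined checkpoints that appear in a trajectory.

    Assumption: coverage is a simple set ratio. The paper's metric may
    weight or verify checkpoints differently.
    """
    if not checkpoints:
        raise ValueError("checkpoints must be non-empty")
    # Repeated visits and events outside the checkpoint set do not count.
    discovered = set(visited) & checkpoints
    return len(discovered) / len(checkpoints)


# Hypothetical checkpoints (states, objects, affordances) for a toy environment.
checkpoints = {"open_drawer", "see_key", "open_fridge", "see_mug"}

# A narrow, repetitive trajectory: repeating an action adds no coverage.
narrow = ["open_drawer", "open_drawer", "open_drawer", "see_key"]
# A broader trajectory with the same number of steps.
broad = ["open_drawer", "see_key", "open_fridge", "see_mug"]

narrow_score = checkpoint_coverage(narrow, checkpoints)
broad_score = checkpoint_coverage(broad, checkpoints)
print(narrow_score, broad_score)
```

Under this reading, the narrow trajectory reaches half of the checkpoints and the broad one reaches all of them, even though both take four steps. That matches the contrast the abstract draws between repetitive behavior and broad discovery.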
