Exploring the Edges of Latent State Clusters for Goal-Conditioned Reinforcement Learning
Yuanlin Duan, Guofeng Cui, He Zhu

TL;DR
This paper introduces CE^2, a goal-directed exploration algorithm that uses clustering in latent space to prioritize frontier states, significantly improving exploration efficiency in complex robotics environments.
Contribution
The paper proposes a novel clustering-based goal selection method for exploration in goal-conditioned reinforcement learning, enhancing the ability to reach rare and frontier states.
Findings
CE^2 outperforms baseline methods in exploration efficiency.
CE^2 is effective in complex robotics tasks such as maze navigation and object manipulation.
It demonstrates superior exploration in challenging environments.
Abstract
Exploring unknown environments efficiently is a fundamental challenge in unsupervised goal-conditioned reinforcement learning. While selecting exploratory goals at the frontier of previously explored states is an effective strategy, the policy during training may still have limited capability of reaching rare goals on the frontier, resulting in reduced exploratory behavior. We propose "Cluster Edge Exploration" (CE^2), a new goal-directed exploration algorithm that, when choosing goals in sparsely explored areas of the state space, gives priority to goal states that remain accessible to the agent. The key idea is to cluster states in a latent space so that each cluster groups states that are easily reachable from one another by the current policy under training, and to traverse to states with significant exploration potential on the boundaries of these clusters before starting exploratory behavior. In challenging…
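The cluster-then-pick-the-edge idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes latent states are already available as plain vectors, uses ordinary k-means with Euclidean distance as a stand-in for whatever clustering and reachability measure the paper uses, and treats "edge" simply as "farthest from the cluster centre". The function names `kmeans` and `edge_goal_candidates` and the parameters `k` and `per_cluster` are hypothetical.

```python
import numpy as np


def kmeans(latents, k, iters=20, seed=0):
    """Plain k-means (a stand-in clustering, assumed for illustration only)."""
    rng = np.random.default_rng(seed)
    # Initialise centres from k distinct data points.
    centers = latents[rng.choice(len(latents), size=k, replace=False)].copy()
    for _ in range(iters):
        dists = np.linalg.norm(latents[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = latents[labels == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)
    # Recompute the assignment so labels agree with the final centres.
    dists = np.linalg.norm(latents[:, None, :] - centers[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    return centers, labels


def edge_goal_candidates(latents, k=4, per_cluster=3, seed=0):
    """Return indices of states lying farthest from their cluster centre.

    "Farthest from the centre" is a crude proxy for "on the cluster edge";
    it is an assumption of this sketch, not the paper's criterion.
    """
    centers, labels = kmeans(latents, k, seed=seed)
    candidates = []
    for j in range(k):
        idx = np.flatnonzero(labels == j)
        if idx.size == 0:
            continue  # skip clusters that ended up empty
        d = np.linalg.norm(latents[idx] - centers[j], axis=1)
        # Keep the per_cluster members with the largest distance.
        edge = idx[np.argsort(d)[-per_cluster:]]
        candidates.extend(edge.tolist())
    return np.array(candidates, dtype=int), labels, centers
```

In a goal-conditioned loop, the returned indices would serve as candidate goals to reach before switching to exploratory behavior; how candidates are ranked by exploration potential is left out here because the summary above does not specify it.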
Taxonomy
Topics: Data Stream Mining Techniques · Bayesian Modeling and Causal Inference · Advanced Text Analysis Techniques
Methods: Pathways Language Model
