Exploring the Edges of Latent State Clusters for Goal-Conditioned Reinforcement Learning

Yuanlin Duan; Guofeng Cui; He Zhu

arXiv:2411.01396·cs.LG·November 5, 2024


TL;DR

This paper introduces CE^2, a goal-directed exploration algorithm that uses clustering in latent space to prioritize frontier states, significantly improving exploration efficiency in complex robotics environments.

Contribution

The paper proposes a novel clustering-based goal selection method for exploration in goal-conditioned reinforcement learning, enhancing the ability to reach rare and frontier states.

Findings

01

CE^2 outperforms baseline methods in exploration efficiency.

02

Effective in complex robotics tasks like maze navigation and object manipulation.

03

Demonstrates superior exploration in challenging environments.

Abstract

Exploring unknown environments efficiently is a fundamental challenge in unsupervised goal-conditioned reinforcement learning. While selecting exploratory goals at the frontier of previously explored states is an effective strategy, the policy during training may still have limited capability of reaching rare goals on the frontier, resulting in reduced exploratory behavior. We propose "Cluster Edge Exploration" ($CE^2$), a new goal-directed exploration algorithm that when choosing goals in sparsely explored areas of the state space gives priority to goal states that remain accessible to the agent. The key idea is clustering to group states that are easily reachable from one another by the current policy under training in a latent space and traversing to states holding significant exploration potential on the boundary of these clusters before doing exploratory behavior. In challenging…
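The cluster-then-pick-the-edge idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes latent states are already available as a NumPy array, substitutes plain k-means for whatever clustering the paper uses, and treats "distance from the own cluster centroid" as a stand-in for exploration potential on the cluster boundary. The function names `kmeans` and `edge_goal_candidates` and all parameters are hypothetical.

```python
import numpy as np


def kmeans(points, k, iters=20, seed=0):
    """Plain k-means (an assumed stand-in for the paper's clustering step)."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct data points.
    centroids = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(iters):
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    # Final assignment, consistent with the final centroids.
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    return centroids, labels


def edge_goal_candidates(latents, k=3, per_cluster=2, seed=0):
    """Return indices of the states farthest from their own cluster centroid.

    Assumption: distance to the centroid approximates "being on the edge"
    of a cluster of mutually reachable states.
    """
    centroids, labels = kmeans(latents, k, seed=seed)
    dist_to_own = np.linalg.norm(latents - centroids[labels], axis=1)
    candidates = []
    for j in range(k):
        idx = np.flatnonzero(labels == j)
        # Sort this cluster's members by distance, farthest first.
        ranked = idx[np.argsort(dist_to_own[idx])[::-1]]
        candidates.extend(ranked[:per_cluster].tolist())
    return candidates, labels, centroids
```

In a goal-conditioned loop, the agent would first be commanded to one of the returned candidate states and only then switch to exploratory behavior, which matches the "traverse to the boundary, then explore" ordering described above. How candidates are scored and sampled in the actual method is not specified in this excerpt.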


Code & Models

Repositories

RU-Automated-Reasoning-Group/CE2 · tf · Official

Videos

Exploring the Edges of Latent State Clusters for Goal-Conditioned Reinforcement Learning · slideslive

Taxonomy

Topics: Data Stream Mining Techniques · Bayesian Modeling and Causal Inference · Advanced Text Analysis Techniques

Methods: Pathways Language Model