HIQL: Offline Goal-Conditioned RL with Latent States as Actions

Seohong Park; Dibya Ghosh; Benjamin Eysenbach; Sergey Levine

arXiv:2307.11949 · cs.LG · March 12, 2024 · 2 cites

TL;DR

This paper introduces HIQL, a hierarchical goal-conditioned reinforcement learning algorithm that treats latent states as actions in order to learn from offline, reward-free data, particularly on long-horizon tasks with high-dimensional observations.

Contribution

The paper presents a novel hierarchical approach that uses latent states as actions, enabling robust offline goal-conditioned RL from diverse data and improving long-horizon task performance.

Findings

01 Successfully solves long-horizon offline goal-reaching tasks.

02 Scales to high-dimensional image observations.

03 Utilizes action-free data effectively.

Abstract

Unsupervised pre-training has recently become the bedrock for computer vision and natural language processing. In reinforcement learning (RL), goal-conditioned RL can potentially provide an analogous self-supervised approach for making use of large quantities of unlabeled (reward-free) data. However, building effective algorithms for goal-conditioned RL that can learn directly from diverse offline data is challenging, because it is hard to accurately estimate the exact value function for faraway goals. Nonetheless, goal-reaching problems exhibit structure, such that reaching distant goals entails first passing through closer subgoals. This structure can be very useful, as assessing the quality of actions for nearby goals is typically easier than for more distant goals. Based on this idea, we propose a hierarchical algorithm for goal-conditioned RL from offline data. Using one…
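The subgoal structure described in the abstract can be illustrated with a toy sketch. This is not the paper's implementation: the value function below is a hand-written distance on a 1D chain rather than a learned one, subgoals are raw states rather than latent representations, and all names (`value`, `high_level`, `low_level`, `rollout`, the horizon `k`) are hypothetical choices made for illustration.

```python
from typing import List

State = int  # position on a 1D chain; a stand-in for a real observation


def value(s: State, g: State) -> float:
    """Toy goal-conditioned value: negative step distance from s to g (assumed, not learned)."""
    return -abs(s - g)


def high_level(s: State, g: State, k: int) -> State:
    """Pick the subgoal within k steps of s that looks best for reaching the distant goal g."""
    candidates = range(s - k, s + k + 1)
    return max(candidates, key=lambda w: value(w, g))


def low_level(s: State, w: State) -> int:
    """Pick the primitive action (-1, 0 or +1) that looks best for reaching the nearby subgoal w."""
    return max((-1, 0, 1), key=lambda a: value(s + a, w))


def rollout(start: State, goal: State, k: int = 5, max_steps: int = 100) -> List[State]:
    """Alternate between choosing a nearby subgoal and taking one action toward it."""
    s = start
    path = [s]
    for _ in range(max_steps):
        if s == goal:
            break
        w = high_level(s, goal, k)  # closer target on the way to the goal
        a = low_level(s, w)         # action judged only against the nearby subgoal
        s = s + a
        path.append(s)
    return path


path = rollout(start=0, goal=23, k=5)
print(path[-1], len(path) - 1)  # final state and number of steps taken
```

In this sketch the low-level choice only ever compares values for a target at most `k` steps away, which is the regime the abstract describes as easier to assess than values for distant goals.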

Code & Models

Repositories

seohongpark/hiql
JAX · Official

Videos

HIQL: Offline Goal-Conditioned RL with Latent States as Actions · SlidesLive

Taxonomy

Topics: Reinforcement Learning in Robotics · Domain Adaptation and Few-Shot Learning · Machine Learning and Data Classification