HIQL: Offline Goal-Conditioned RL with Latent States as Actions
Seohong Park, Dibya Ghosh, Benjamin Eysenbach, Sergey Levine

TL;DR
This paper introduces HIQL, a hierarchical goal-conditioned reinforcement learning algorithm that leverages latent states as actions to effectively learn from offline, reward-free data, especially for long-horizon tasks with high-dimensional observations.
Contribution
The paper presents a novel hierarchical approach that uses latent states as actions, enabling robust offline goal-conditioned RL from diverse data and improving long-horizon task performance.
Findings
Successfully solves long-horizon offline goal-reaching tasks.
Scales to high-dimensional image observations.
Utilizes action-free data effectively.
Abstract
Unsupervised pre-training has recently become the bedrock for computer vision and natural language processing. In reinforcement learning (RL), goal-conditioned RL can potentially provide an analogous self-supervised approach for making use of large quantities of unlabeled (reward-free) data. However, building effective algorithms for goal-conditioned RL that can learn directly from diverse offline data is challenging, because it is hard to accurately estimate the exact value function for faraway goals. Nonetheless, goal-reaching problems exhibit structure, such that reaching distant goals entails first passing through closer subgoals. This structure can be very useful, as assessing the quality of actions for nearby goals is typically easier than for more distant goals. Based on this idea, we propose a hierarchical algorithm for goal-conditioned RL from offline data. Using one…
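The subgoal structure described in the abstract can be illustrated with a toy two-level controller. This is a minimal sketch and not the authors' implementation. It assumes a 1-D integer chain environment, and the names `high_level_policy`, `low_level_policy`, `rollout`, and the horizon `k` are hypothetical. Both policies are hand-written stand-ins for what would be learned from offline data, and the subgoal here is a raw state rather than a learned latent representation.

```python
import numpy as np

def high_level_policy(state, goal, k=5):
    """Hypothetical stand-in for a learned high-level policy.

    Proposes a subgoal at most k steps from the current state,
    in the direction of the (possibly distant) goal.
    """
    return state + int(np.clip(goal - state, -k, k))

def low_level_policy(state, subgoal):
    """Hypothetical stand-in for a learned low-level policy.

    Returns a primitive action in {-1, 0, +1} that moves toward the
    nearby subgoal, which is easier to evaluate than the far goal.
    """
    return int(np.sign(subgoal - state))

def rollout(start, goal, k=5, max_steps=200):
    """Alternate the two levels until the goal is reached."""
    state = start
    trajectory = [state]
    for _ in range(max_steps):
        if state == goal:
            break
        subgoal = high_level_policy(state, goal, k)  # pick a closer target
        action = low_level_policy(state, subgoal)    # act toward that target
        state = state + action
        trajectory.append(state)
    return trajectory

traj = rollout(start=0, goal=23)
```

The low-level policy only ever compares actions against a target within `k` steps, which mirrors the abstract's observation that judging actions for nearby goals is typically easier than for distant ones.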
Taxonomy
Topics: Reinforcement Learning in Robotics · Domain Adaptation and Few-Shot Learning · Machine Learning and Data Classification
