Balancing Exploration and Exploitation in Hierarchical Reinforcement Learning via Latent Landmark Graphs

Qingyang Zhang; Yiming Yang; Jingqing Ruan; Xuantang Xiong; Dengpeng Xing; Bo Xu

arXiv:2307.12063·cs.LG·July 25, 2023

TL;DR

This paper introduces HILL, a hierarchical reinforcement learning method that learns temporally coherent latent subgoal representations and dynamically builds landmark graphs to effectively balance exploration and exploitation in complex tasks.

Contribution

The paper proposes HILL, which learns latent subgoal representations that are temporally coherent and dynamically constructs a landmark graph over them to improve subgoal selection.

Findings

1. HILL outperforms state-of-the-art methods on continuous control tasks.
2. HILL improves sample efficiency and asymptotic performance.
3. The approach effectively balances exploration and exploitation.

Abstract

Goal-Conditioned Hierarchical Reinforcement Learning (GCHRL) is a promising paradigm for addressing the exploration-exploitation dilemma in reinforcement learning. It decomposes the source task into subgoal-conditioned subtasks and conducts exploration and exploitation in the subgoal space. The effectiveness of GCHRL relies heavily on the subgoal representation function and the subgoal selection strategy. However, existing works often overlook temporal coherence in GCHRL when learning latent subgoal representations, and they lack an efficient subgoal selection strategy that balances exploration and exploitation. This paper proposes HIerarchical reinforcement learning via dynamically building Latent Landmark graphs (HILL) to overcome these limitations. HILL learns latent subgoal representations that satisfy temporal coherence using a contrastive representation learning objective. Based on these…
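To make the idea of a temporally coherent contrastive objective more concrete, here is a minimal sketch in plain Python. It is an illustration under assumptions, not the authors' implementation: the abstract does not give the loss, so the triplet form below (pull temporally adjacent states together, push temporally distant states at least a margin apart), the linear `encode` function, the `margin` parameter, and the toy trajectory are all hypothetical choices made for this example.

```python
def encode(state, weights):
    """Hypothetical linear encoder: maps a state vector to a latent subgoal vector."""
    return [sum(w * x for w, x in zip(row, state)) for row in weights]


def sq_dist(a, b):
    """Squared Euclidean distance between two latent vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def temporal_contrastive_loss(anchor, positive, negative, weights, margin=1.0):
    """Triplet-style loss (an assumed form, not taken from the paper).

    anchor and positive are states close in time; negative is far in time.
    The first term pulls adjacent states together in latent space; the
    second term penalizes distant states that fall within `margin`.
    """
    z_a = encode(anchor, weights)
    z_p = encode(positive, weights)
    z_n = encode(negative, weights)
    pull = sq_dist(z_a, z_p)
    push = max(0.0, margin - sq_dist(z_a, z_n))
    return pull + push


# Toy trajectory (assumption): a point moving steadily along the x-axis.
trajectory = [[0.1 * t, 0.0] for t in range(50)]
identity = [[1.0, 0.0], [0.0, 1.0]]  # encoder weights that leave states unchanged

# Correct temporal ordering: the adjacent state is the positive.
loss_ordered = temporal_contrastive_loss(
    trajectory[0], trajectory[1], trajectory[40], identity
)
# Swapped roles: the distant state is treated as the positive.
loss_swapped = temporal_contrastive_loss(
    trajectory[0], trajectory[40], trajectory[1], identity
)
```

With an encoder that already preserves temporal structure, the loss is small when the temporally adjacent state is used as the positive and large when the roles are swapped, which is the property a temporally coherent representation is meant to have.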

Code & Models

Repositories

papercode2022/hill (PyTorch, official)

Taxonomy

Topics: Reinforcement Learning in Robotics