Bidirectional-Reachable Hierarchical Reinforcement Learning with Mutually Responsive Policies

Yu Luo; Fuchun Sun; Tianying Ji; Xianyuan Zhan

arXiv:2406.18053·cs.LG·June 27, 2024

TL;DR

This paper introduces BrHPO, a hierarchical reinforcement learning algorithm with bidirectional communication between levels, improving long-horizon task performance by enhancing subgoal reachability and exploration efficiency.

Contribution

The paper proposes a novel bidirectional-reachable mechanism for HRL, enabling real-time mutual feedback between levels to overcome local optima and improve task success.

Findings

01. BrHPO outperforms state-of-the-art HRL methods in various long-horizon tasks.

02. It achieves higher exploration efficiency and robustness.

03. The approach is computationally efficient.

Abstract

Hierarchical reinforcement learning (HRL) addresses complex long-horizon tasks by skillfully decomposing them into subgoals. Therefore, the effectiveness of HRL is greatly influenced by subgoal reachability. Typical HRL methods only consider subgoal reachability from the unilateral level, where a dominant level enforces compliance to the subordinate level. However, we observe that when the dominant level becomes trapped in local exploration or generates unattainable subgoals, the subordinate level is negatively affected and cannot follow the dominant level's actions. This can potentially make both levels stuck in local optima, ultimately hindering subsequent subgoal reachability. Allowing real-time bilateral information sharing and error correction would be a natural cure for this issue, which motivates us to propose a mutual response mechanism. Based on this, we propose the…

Code & Models

Repositories

roythuly/brhpo · PyTorch · Official

Taxonomy

Topics: Transportation and Mobility Innovations · Supply Chain and Inventory Management · Mobile Crowdsensing and Crowdsourcing