Adjacency constraint for efficient hierarchical reinforcement learning
Tianren Zhang, Shangqi Guo, Tian Tan, Xiaolin Hu, Feng Chen

TL;DR
This paper introduces an adjacency constraint in hierarchical reinforcement learning to limit the high-level goal space, which improves training efficiency and performance in complex control tasks.
Contribution
It proposes a novel adjacency constraint method that preserves optimal policies in deterministic MDPs and bounds suboptimality in stochastic MDPs, with practical implementation via an adjacency network.
Findings
Significant performance improvements on robot locomotion tasks.
Effective reduction of goal space complexity.
Theoretical guarantees for policy optimality and suboptimality bounds.
Abstract
Goal-conditioned Hierarchical Reinforcement Learning (HRL) is a promising approach for scaling up reinforcement learning (RL) techniques. However, it often suffers from training inefficiency because the action space of the high-level policy, i.e., the goal space, is large. Searching in a large goal space poses difficulty for both high-level subgoal generation and low-level policy learning. In this paper, we show that this problem can be effectively alleviated by restricting the high-level action space from the whole goal space to a k-step adjacent region of the current state using an adjacency constraint. We theoretically prove that in a deterministic Markov Decision Process (MDP), the proposed adjacency constraint preserves the optimal hierarchical policy, while in a stochastic MDP the adjacency constraint induces a bounded state-value suboptimality determined by the MDP's transition structure.…
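To make the idea of a k-step adjacent region concrete, the following is a minimal tabular sketch, not the paper's implementation (the summary above says the practical method uses a learned adjacency network). It assumes a small deterministic grid world in which goals are states, and every name in it (`grid_neighbors`, `shortest_transition_distances`, `adjacent_subgoals`, the 5x5 layout, and the wall positions) is hypothetical and chosen only for illustration. Adjacency is measured here as the minimum number of environment steps between two states, computed with breadth-first search.

```python
from collections import deque


def grid_neighbors(width, height, walls=frozenset()):
    """Return a neighbor function for a deterministic 4-connected grid (illustrative only)."""
    def neighbors(state):
        x, y = state
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nx, ny = x + dx, y + dy
            inside = 0 <= nx < width and 0 <= ny < height
            if inside and (nx, ny) not in walls:
                yield (nx, ny)
    return neighbors


def shortest_transition_distances(start, neighbors):
    """Breadth-first search: minimum number of steps from start to each reachable state."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for nxt in neighbors(state):
            if nxt not in dist:
                dist[nxt] = dist[state] + 1
                queue.append(nxt)
    return dist


def adjacent_subgoals(state, k, neighbors):
    """Candidate subgoals restricted to states reachable from `state` within k steps."""
    dist = shortest_transition_distances(state, neighbors)
    return {goal for goal, d in dist.items() if d <= k}


# Hypothetical 5x5 grid with a short wall next to the start state.
neighbors = grid_neighbors(5, 5, walls=frozenset({(1, 0), (1, 1)}))
region = adjacent_subgoals((0, 0), k=2, neighbors=neighbors)
print(sorted(region))
```

In this toy layout the cell (2, 0) is close to the start in coordinate space but lies behind the wall, so it falls outside the 2-step region. This is the kind of subgoal the constraint is meant to exclude from the high-level action space: one that the low-level policy cannot reach within its step budget.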
Taxonomy
Topics: Reinforcement Learning in Robotics
