Proposing Hierarchical Goal-Conditioned Policy Planning in Multi-Goal Reinforcement Learning
Gavin B. Rens

TL;DR
This paper introduces HGCPP, a hierarchical framework combining goal-conditioned policies, Monte Carlo Tree Search, and hierarchical reinforcement learning to improve planning and learning efficiency in complex, goal-oriented tasks for humanoid robots.
Contribution
The paper presents a novel hierarchical planning method that integrates goal-conditioned policies with MCTS, enabling efficient exploration and reuse of high-level actions in multi-goal reinforcement learning.
Findings
Enhanced sample efficiency in complex tasks
Faster reasoning through hierarchy and reuse of HLAs
Potentially improved exploration in sparse-reward environments
Abstract
Humanoid robots must master numerous tasks with sparse rewards, posing a challenge for reinforcement learning (RL). We propose a method combining RL and automated planning to address this. Our approach uses short goal-conditioned policies (GCPs) organized hierarchically, with Monte Carlo Tree Search (MCTS) planning using high-level actions (HLAs). Instead of primitive actions, the planning process generates HLAs. A single plan-tree, maintained during the agent's lifetime, holds knowledge about goal achievement. This hierarchy enhances sample efficiency and speeds up reasoning by reusing HLAs and anticipating future actions. Our Hierarchical Goal-Conditioned Policy Planning (HGCPP) framework uniquely integrates GCPs, MCTS, and hierarchical RL, potentially improving exploration and planning in complex tasks.
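The plan-tree described in the abstract can be pictured with a minimal Python sketch. Everything in it is an illustrative assumption rather than the paper's implementation: the names `HLA`, `PlanNode`, `add_hla`, `select_ucb1`, and `backup` are hypothetical, states and goals are plain strings, and UCB1 is used only because it is a common MCTS selection rule, not because the source confirms it. The sketch shows two ideas from the abstract: edges of the tree are high-level actions rather than primitive actions, and an HLA already stored for a goal is reused instead of recreated.

```python
import math
from dataclasses import dataclass, field


@dataclass
class HLA:
    """Hypothetical high-level action: a short policy aimed at one goal."""
    goal: str
    visits: int = 0
    total_value: float = 0.0
    child: "PlanNode | None" = None

    @property
    def mean_value(self) -> float:
        return self.total_value / self.visits if self.visits else 0.0


@dataclass
class PlanNode:
    """Node of a single persistent plan-tree; outgoing edges are HLAs."""
    state: str
    hlas: dict = field(default_factory=dict)  # goal -> HLA

    def add_hla(self, goal: str) -> HLA:
        # Reuse the stored HLA for this goal if one already exists.
        if goal not in self.hlas:
            self.hlas[goal] = HLA(goal=goal, child=PlanNode(state=goal))
        return self.hlas[goal]

    def select_ucb1(self, c: float = 1.4) -> HLA:
        # Assumed selection rule (UCB1); untried HLAs are picked first.
        total = sum(h.visits for h in self.hlas.values())

        def score(h: HLA) -> float:
            if h.visits == 0:
                return math.inf
            return h.mean_value + c * math.sqrt(math.log(total) / h.visits)

        return max(self.hlas.values(), key=score)


def backup(path: list, value: float) -> None:
    # Propagate an observed return along the HLAs that were executed.
    for hla in path:
        hla.visits += 1
        hla.total_value += value


root = PlanNode("start")
door = root.add_hla("reach_door")
cup = root.add_hla("pick_cup")
backup([door], 1.0)
backup([cup], 0.0)
backup([door], 1.0)
best = root.select_ucb1(c=0.0)  # c=0 makes the choice purely greedy
```

Because the tree persists across calls, statistics gathered while pursuing one goal remain available when the same HLA is considered again, which is the kind of reuse the abstract credits for faster reasoning.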
Taxonomy
Topics: Transportation and Mobility Innovations · Traffic control and management · Elevator Systems and Control
