Proposing Hierarchical Goal-Conditioned Policy Planning in Multi-Goal Reinforcement Learning

Gavin B. Rens

arXiv:2501.01727 · cs.AI · January 6, 2025

Open Access

TL;DR

This paper proposes HGCPP, a hierarchical framework that combines goal-conditioned policies, Monte Carlo Tree Search, and hierarchical reinforcement learning, with the aim of improving planning and learning efficiency on sparse-reward, goal-oriented tasks for humanoid robots.

Contribution

The paper presents a hierarchical planning method that integrates goal-conditioned policies (GCPs) with Monte Carlo Tree Search (MCTS), so that planning operates over reusable high-level actions (HLAs) rather than primitive actions in multi-goal reinforcement learning.

Findings

01 Enhanced sample efficiency in complex tasks

02 Faster reasoning through hierarchy and reuse of high-level actions (HLAs)

03 Improved exploration capabilities in sparse-reward environments

Abstract

Humanoid robots must master numerous tasks with sparse rewards, posing a challenge for reinforcement learning (RL). We propose a method combining RL and automated planning to address this. Our approach uses short goal-conditioned policies (GCPs) organized hierarchically, with Monte Carlo Tree Search (MCTS) planning using high-level actions (HLAs). Instead of primitive actions, the planning process generates HLAs. A single plan-tree, maintained during the agent's lifetime, holds knowledge about goal achievement. This hierarchy enhances sample efficiency and speeds up reasoning by reusing HLAs and anticipating future actions. Our Hierarchical Goal-Conditioned Policy Planning (HGCPP) framework uniquely integrates GCPs, MCTS, and hierarchical RL, potentially improving exploration and planning in complex tasks.
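The idea of planning over HLAs in a single persistent plan-tree can be illustrated with a toy sketch. This is not the HGCPP algorithm itself: the 1-D state space, the `run_gcp` stand-in for a learned goal-conditioned policy, the `candidate_goals` function, the discount of 0.9, and the use of UCB1 for selection are all assumptions made for illustration. Each edge in the tree is an HLA ("run a short GCP toward this goal"), and the same tree is extended across repeated simulations, so statistics gathered for an HLA are reused later.

```python
import math
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class HLA:
    """A high-level action: run a short goal-conditioned policy toward `goal`."""
    goal: int
    visits: int = 0
    total_value: float = 0.0
    child: Optional["Node"] = None  # state reached after executing the HLA


@dataclass
class Node:
    state: int
    visits: int = 0
    hlas: dict = field(default_factory=dict)  # goal -> HLA


def run_gcp(state, goal, horizon=3):
    """Hypothetical stand-in for a learned GCP: move at most `horizon` steps toward `goal`."""
    return state + max(-horizon, min(horizon, goal - state))


def candidate_goals(state, lo=0, hi=8):
    """Assumed goal proposal: nearby states on a bounded 1-D line."""
    return [g for g in (state + 1, state + 2, state - 1, state - 2) if lo <= g <= hi]


def ucb1(node, hla, c=1.4):
    """Standard UCB1 score; untried HLAs are explored first."""
    if hla.visits == 0:
        return math.inf
    return hla.total_value / hla.visits + c * math.sqrt(math.log(node.visits) / hla.visits)


def simulate(node, target, depth, discount=0.9):
    """One pass down the persistent tree; sparse reward of 1.0 only at the target."""
    if node.state == target:
        return 1.0
    if depth == 0:
        return 0.0
    for g in candidate_goals(node.state):
        node.hlas.setdefault(g, HLA(goal=g))
    hla = max(node.hlas.values(), key=lambda h: ucb1(node, h))
    if hla.child is None:
        hla.child = Node(run_gcp(node.state, hla.goal))
    value = discount * simulate(hla.child, target, depth - 1)
    node.visits += 1
    hla.visits += 1
    hla.total_value += value
    return value


def most_visited_plan(root, target):
    """Read off a sequence of HLA goals by following the most-visited edges."""
    plan, node = [], root
    while node.state != target and node.hlas:
        hla = max(node.hlas.values(), key=lambda h: h.visits)
        plan.append(hla.goal)
        node = hla.child
    return plan


root = Node(state=0)          # one tree, kept for the whole run
for _ in range(200):
    simulate(root, target=6, depth=6)
print(most_visited_plan(root, target=6))
```

Because `root` is never rebuilt, visit counts and value estimates accumulate across all 200 simulations, which is the sense in which HLAs are reused here. A real system would replace `run_gcp` with learned policies and would need to handle stochastic outcomes.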

Taxonomy

Topics: Transportation and Mobility Innovations · Traffic control and management · Elevator Systems and Control