Hierarchical Reinforcement Learning with Advantage-Based Auxiliary Rewards
Siyuan Li, Rui Wang, Minxue Tang, Chongjie Zhang

TL;DR
This paper introduces a hierarchical reinforcement learning framework that uses advantage-based auxiliary rewards to adapt low-level skills to new tasks efficiently, without domain-specific reward engineering.
Contribution
We propose a novel HRL method that sets auxiliary rewards for low-level skills based on the advantage function of the high-level policy, so that both levels can be learned simultaneously and the trained policies can be transferred to new tasks.
Findings
Outperforms state-of-the-art HRL methods in MuJoCo domains
Enables transfer of trained policies to new tasks
Proves that optimizing low-level skills with the auxiliary reward increases the task return of the joint policy
Abstract
Hierarchical Reinforcement Learning (HRL) is a promising approach to solving long-horizon problems with sparse and delayed rewards. Many existing HRL algorithms either use pre-trained low-level skills that are unadaptable, or require domain-specific information to define low-level rewards. In this paper, we aim to adapt low-level skills to downstream tasks while maintaining the generality of reward design. We propose an HRL framework which sets auxiliary rewards for low-level skill training based on the advantage function of the high-level policy. This auxiliary reward enables efficient, simultaneous learning of the high-level policy and low-level skills without using task-specific knowledge. In addition, we also theoretically prove that optimizing low-level skills with this auxiliary reward will increase the task return for the joint policy. Experimental results show that our algorithm…
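The abstract describes the auxiliary reward only in words, so the following is a minimal sketch of one way such a reward could be computed, not the authors' implementation. It assumes a one-step advantage estimate for a high-level action that lasts k environment steps, and it assumes that advantage is split evenly across those k low-level steps. The function names, the value estimates, and the even split are all illustrative assumptions.

```python
from typing import List


def high_level_advantage(
    rewards: List[float], v_start: float, v_end: float, gamma: float
) -> float:
    """One-step advantage estimate for a single high-level action.

    rewards: environment rewards collected while the chosen low-level
             skill ran (one entry per environment step).
    v_start: high-level value estimate at the state where the skill started.
    v_end:   high-level value estimate at the state where the skill ended.
    All of these inputs are hypothetical placeholders for this sketch.
    """
    k = len(rewards)
    discounted = sum((gamma ** t) * r for t, r in enumerate(rewards))
    return discounted + (gamma ** k) * v_end - v_start


def auxiliary_rewards(advantage: float, k: int) -> List[float]:
    """Assumed scheme: share the high-level advantage equally over k steps."""
    if k <= 0:
        raise ValueError("k must be positive")
    return [advantage / k] * k


# A skill ran for 3 steps and received a sparse reward on the last one.
step_rewards = [0.0, 0.0, 1.0]
adv = high_level_advantage(step_rewards, v_start=0.5, v_end=0.0, gamma=1.0)
low_level = auxiliary_rewards(adv, k=len(step_rewards))
```

Under these assumptions, a skill whose execution led to a better-than-expected outcome for the high-level policy receives a positive per-step reward, and one that led to a worse outcome receives a negative one, without any task-specific reward design.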
Taxonomy
Topics: Software Engineering Research · Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications
