On the benefits of pixel-based hierarchical policies for task generalization

Tudor Cristea-Platon; Bogdan Mazoure; Josh Susskind; Walter Talbott

arXiv:2407.19142·cs.LG·July 30, 2024

Open Access

TL;DR

This paper demonstrates that pixel-based hierarchical policies in reinforcement learning improve multi-task generalization and training-task performance, and reduce the complexity of fine-tuning, in simulated robotic control tasks.

Contribution

It provides empirical evidence that hierarchical policies trained with task conditioning enhance generalization and performance in pixel-based multi-task reinforcement learning.

Findings

01 Hierarchical policies improve performance on the training tasks.

02 They improve reward and state-space generalization on similar tasks.

03 They reduce the complexity of fine-tuning needed to solve novel tasks.

Abstract

Reinforcement learning practitioners often avoid hierarchical policies, especially in image-based observation spaces. Typically, the single-task performance improvement over flat-policy counterparts does not justify the additional complexity associated with implementing a hierarchy. However, by introducing multiple decision-making levels, hierarchical policies can compose lower-level policies to more effectively generalize between tasks, highlighting the need for multi-task evaluations. We analyze the benefits of hierarchy through simulated multi-task robotic control experiments from pixels. Our results show that hierarchical policies trained with task conditioning can (1) increase performance on training tasks, (2) lead to improved reward and state-space generalizations in similar tasks, and (3) decrease the complexity of fine tuning required to solve novel tasks. Thus, we believe that…
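
To make the idea of "multiple decision-making levels" concrete, the following is a minimal sketch of a two-level, task-conditioned policy. It is not the architecture used in the paper: the class name, the linear controllers, the `skill_horizon` parameter, and the assumption that an image encoder (omitted here) has already turned pixels into a feature vector are all hypothetical choices made for illustration.

```python
import random


class HierarchicalPolicy:
    """Illustrative two-level policy (hypothetical, not the paper's model).

    A task-conditioned high level picks one of several low-level skills
    every `skill_horizon` steps; the chosen skill maps features to actions.
    """

    def __init__(self, feat_dim, num_tasks, num_skills, action_dim,
                 skill_horizon=4, seed=0):
        rng = random.Random(seed)

        def matrix(rows, cols):
            # Small random weights stand in for learned parameters.
            return [[rng.gauss(0.0, 0.1) for _ in range(cols)]
                    for _ in range(rows)]

        # High level scores each skill from [features ; task one-hot].
        self.high = matrix(num_skills, feat_dim + num_tasks)
        # One linear controller per skill. These see features only, so the
        # same skills can be composed differently for different tasks.
        self.low = [matrix(action_dim, feat_dim) for _ in range(num_skills)]
        self.num_tasks = num_tasks
        self.skill_horizon = skill_horizon
        self.steps = 0
        self.skill = None

    @staticmethod
    def _matvec(m, v):
        return [sum(w * x for w, x in zip(row, v)) for row in m]

    def act(self, features, task_id):
        # Re-select the skill only at the start of each skill window.
        if self.steps % self.skill_horizon == 0:
            one_hot = [1.0 if i == task_id else 0.0
                       for i in range(self.num_tasks)]
            scores = self._matvec(self.high, list(features) + one_hot)
            self.skill = max(range(len(scores)), key=scores.__getitem__)
        self.steps += 1
        return self.skill, self._matvec(self.low[self.skill], features)


policy = HierarchicalPolicy(feat_dim=8, num_tasks=3, num_skills=4, action_dim=2)
skill, action = policy.act([0.5] * 8, task_id=1)
```

In this sketch only the high level receives the task identifier, which is one plausible way to let fine-tuning for a new task touch the skill selector while leaving the low-level controllers unchanged.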

Taxonomy

Topics: Reinforcement Learning in Robotics