Hierarchical Reinforcement Learning under Mixed Observability

Hai Nguyen; Zhihan Yang; Andrea Baisero; Xiao Ma; Robert Platt; Christopher Amato

arXiv:2204.00898 · cs.RO · June 7, 2022

TL;DR

This paper introduces HILMO, a two-level hierarchical reinforcement learning framework for a subclass of MOMDPs. It confines partial observability to the top level and keeps the bottom level fully observable, which improves learning efficiency and success rates in robotic control tasks.

Contribution

The paper proposes a novel hierarchical RL approach tailored for a specific subclass of MOMDPs, with theoretical guarantees and practical validation on robotic tasks.

Findings

01 Enhanced success rate in robotic control tasks

02 Improved sample efficiency and training time

03 Successful deployment on real robots

Abstract

The framework of mixed observable Markov decision processes (MOMDP) models many robotic domains in which some state variables are fully observable while others are not. In this work, we identify a significant subclass of MOMDPs defined by how actions influence the fully observable components of the state and how those, in turn, influence the partially observable components and the rewards. This unique property allows for a two-level hierarchical approach we call HIerarchical Reinforcement Learning under Mixed Observability (HILMO), which restricts partial observability to the top level while the bottom level remains fully observable, enabling higher learning efficiency. The top level produces desired goals to be reached by the bottom level until the task is solved. We further develop theoretical guarantees to show that our approach can achieve optimal and quasi-optimal behavior under…
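
The two-level structure described in the abstract can be illustrated with a small toy sketch. This is not the paper's algorithm or code: the corridor environment, the hand-written policies, and every name below (ToyCorridor, TopPolicy, bottom_policy, run_episode) are hypothetical, and both levels are scripted here rather than learned. The sketch only shows the division of labor: the top level keeps a history of observations because part of the state is hidden, and emits goals in the fully observable part of the state; the bottom level sees that part directly and acts to reach each goal.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ToyCorridor:
    """Hypothetical 1-D task: position x is observed, the target cell is hidden."""
    length: int = 5
    hidden_target: int = 4   # hidden state: which end cell holds the target
    x: int = 2               # fully observable state: agent position

    def step(self, action: int) -> int:
        """Move by action in {-1, +1}, clipped to the corridor; return new x."""
        self.x = max(0, min(self.length - 1, self.x + action))
        return self.x

    def observe(self) -> int:
        """Observation of the hidden state: 1 only when standing on the target."""
        return 1 if self.x == self.hidden_target else 0


def bottom_policy(x: int, goal: int) -> int:
    """Bottom level (fully observable): step toward the goal position."""
    return 1 if goal > x else -1


@dataclass
class TopPolicy:
    """Top level (partially observable): chooses goals from its own history."""
    history: List[Tuple[int, int]] = field(default_factory=list)

    def next_goal(self, length: int) -> int:
        # Scripted stand-in for a learned, history-dependent policy:
        # probe each end of the corridor that has not been visited yet.
        visited = {goal for goal, _ in self.history}
        for candidate in (0, length - 1):
            if candidate not in visited:
                return candidate
        return length - 1


def run_episode(env: ToyCorridor, max_goals: int = 4) -> Tuple[bool, List[int]]:
    """Alternate between top-level goal selection and bottom-level goal reaching."""
    top = TopPolicy()
    goals: List[int] = []
    for _ in range(max_goals):
        goal = top.next_goal(env.length)
        goals.append(goal)
        while env.x != goal:                  # bottom level runs until goal reached
            env.step(bottom_policy(env.x, goal))
        obs = env.observe()
        top.history.append((goal, obs))       # only the top level tracks history
        if obs == 1:
            return True, goals                # task solved
    return False, goals


success, goals = run_episode(ToyCorridor())
print(success, goals)
```

In this toy, the bottom-level policy is a function of the current position and goal only, so it needs no memory; all reasoning about the hidden target happens in the top level's goal choices.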

Taxonomy

Topics: Reinforcement Learning in Robotics