Hierarchical Reinforcement Learning under Mixed Observability
Hai Nguyen, Zhihan Yang, Andrea Baisero, Xiao Ma, Robert Platt, Christopher Amato

TL;DR
This paper introduces HILMO, a two-level hierarchical reinforcement learning framework for a subclass of MOMDPs. It confines partial observability to the top level and keeps the bottom level fully observable, which improves learning efficiency and success rates in robotic control tasks.
Contribution
The paper proposes a novel hierarchical RL approach tailored for a specific subclass of MOMDPs, with theoretical guarantees and practical validation on robotic tasks.
Findings
Enhanced success rate in robotic control tasks
Improved sample efficiency and training time
Successful deployment on real robots
Abstract
The framework of mixed observable Markov decision processes (MOMDP) models many robotic domains in which some state variables are fully observable while others are not. In this work, we identify a significant subclass of MOMDPs defined by how actions influence the fully observable components of the state and how those, in turn, influence the partially observable components and the rewards. This unique property allows for a two-level hierarchical approach we call HIerarchical Reinforcement Learning under Mixed Observability (HILMO), which restricts partial observability to the top level while the bottom level remains fully observable, enabling higher learning efficiency. The top level produces desired goals to be reached by the bottom level until the task is solved. We further develop theoretical guarantees to show that our approach can achieve optimal and quasi-optimal behavior under…
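The two-level loop described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the toy task (ToyLineSearch), the hand-written policies, and every name below are hypothetical stand-ins chosen for illustration, and nothing here is learned. The sketch only shows the division of labor: the top level sees a history of observations and emits goals in the fully observable part of the state, while the bottom level reaches each goal using the fully observable position alone.

```python
from dataclasses import dataclass


@dataclass
class ToyLineSearch:
    """Hypothetical toy task, not taken from the paper.

    The agent position is fully observable. The target cell is hidden and
    is revealed only by a binary observation taken at the agent's cell.
    """
    size: int = 7
    hidden_target: int = 5
    position: int = 0

    def step(self, action):
        # action is -1, 0, or +1; the position is clipped to the line.
        self.position = max(0, min(self.size - 1, self.position + action))
        return self.position

    def observe(self):
        # Partial observation: True only when standing on the hidden target.
        return self.position == self.hidden_target


def bottom_policy(position, goal):
    """Fully observable goal reaching: move one cell toward the goal."""
    return (goal > position) - (goal < position)


def top_policy(history, size):
    """Choose the next goal from the history of (goal, observation) pairs.

    A hand-written stand-in for a learned, memory-based policy: pick the
    lowest-index cell that has not been visited yet.
    """
    visited = {goal for goal, _ in history}
    for cell in range(size):
        if cell not in visited:
            return cell
    return 0  # every cell visited; fall back to the first cell


def run_episode(env, max_goals=20):
    """Alternate between the two levels until the target is observed."""
    history = []
    primitive_steps = 0
    for _ in range(max_goals):
        goal = top_policy(history, env.size)
        # The bottom level acts on the fully observable position only.
        while env.position != goal:
            env.step(bottom_policy(env.position, goal))
            primitive_steps += 1
        found = env.observe()
        history.append((goal, found))
        if found:
            return True, len(history), primitive_steps
    return False, len(history), primitive_steps


solved, goals_issued, steps = run_episode(ToyLineSearch())
print(solved, goals_issued, steps)
```

In this sketch only top_policy needs memory, because it is the only component that reasons about the hidden target; bottom_policy is a memoryless function of the current position and goal. That separation is the property the abstract credits for the gain in learning efficiency.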
Taxonomy
Topics: Reinforcement Learning in Robotics
