Provable Hierarchy-Based Meta-Reinforcement Learning
Kurtland Chua, Qi Lei, Jason D. Lee

TL;DR
This paper analyzes hierarchy-based meta-reinforcement learning in a tabular setting and gives a tractable, optimism-based method that provably recovers latent hierarchical structure during meta-training, in contrast to prior approaches that rely on expert-constructed hierarchies or on heuristics without guarantees.
Contribution
It provides the first theoretical analysis of hierarchy learning in meta-RL, with guarantees for recovering natural hierarchies and bounds on downstream task performance.
Findings
Guarantees for hierarchy recovery under diversity conditions
Sample-efficient meta-training for hierarchical structures
Regret bounds for downstream task performance
Abstract
Hierarchical reinforcement learning (HRL) has seen widespread interest as an approach to tractable learning of complex modular behaviors. However, existing work either assumes access to expert-constructed hierarchies or uses hierarchy-learning heuristics with no provable guarantees. To address this gap, we analyze HRL in the meta-RL setting, where a learner learns latent hierarchical structure during meta-training for use in a downstream task. We consider a tabular setting where natural hierarchical structure is embedded in the transition dynamics. Analogous to supervised meta-learning theory, we provide "diversity conditions" which, together with a tractable optimism-based algorithm, guarantee sample-efficient recovery of this natural hierarchy. Furthermore, we provide regret bounds on a learner using the recovered hierarchy to solve a meta-test task. Our bounds incorporate common…
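As a rough illustration of what "optimism-based" usually means in a tabular setting, the sketch below runs finite-horizon value iteration on empirical estimates and adds a count-based exploration bonus. It is a generic textbook-style construction, not the algorithm from the paper: the function name `optimistic_q_values`, the `bonus_scale` parameter, the `sqrt(1/n)` bonus shape, and the assumption that rewards lie in [0, 1] are all choices made here for illustration.

```python
import math


def optimistic_q_values(counts, reward_sums, next_counts, horizon, bonus_scale=1.0):
    """Optimistic finite-horizon value iteration from empirical counts.

    Hypothetical helper for illustration only (not the paper's algorithm).
    counts[s][a]           -- number of visits to (s, a)
    reward_sums[s][a]      -- total reward observed at (s, a), rewards in [0, 1]
    next_counts[s][a][s2]  -- number of observed transitions (s, a) -> s2
    Returns Q-values for the first step, i.e. with `horizon` steps remaining.
    """
    n_states = len(counts)
    n_actions = len(counts[0])
    value = [0.0] * n_states  # value after the final step is zero
    q = [[0.0] * n_actions for _ in range(n_states)]
    for step in range(horizon):
        remaining = step + 1  # steps left; also the largest possible return
        new_q = [[0.0] * n_actions for _ in range(n_states)]
        for s in range(n_states):
            for a in range(n_actions):
                n = counts[s][a]
                if n == 0:
                    # Unvisited pairs get the largest possible value.
                    new_q[s][a] = float(remaining)
                    continue
                mean_reward = reward_sums[s][a] / n
                expected_next = sum(
                    next_counts[s][a][s2] / n * value[s2] for s2 in range(n_states)
                )
                bonus = bonus_scale * math.sqrt(1.0 / n)  # shrinks as data accumulates
                new_q[s][a] = min(float(remaining), mean_reward + expected_next + bonus)
        q = new_q
        value = [max(row) for row in q]
    return q


# Toy usage: one state, two actions; action 1 has never been tried.
counts = [[4, 0]]
reward_sums = [[2.0, 0.0]]
next_counts = [[[4], [0]]]
q_one_step = optimistic_q_values(counts, reward_sums, next_counts, horizon=1, bonus_scale=0.5)
q_two_step = optimistic_q_values(counts, reward_sums, next_counts, horizon=2, bonus_scale=0.5)
```

Acting greedily with respect to these inflated values steers the learner toward rarely visited state-action pairs, which is the basic mechanism behind many optimism-based exploration guarantees.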
Taxonomy
Topics: Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control · Autonomous Vehicle Technology and Safety
