Realizable Abstractions: Near-Optimal Hierarchical Reinforcement Learning
Roberto Cipollone, Luca Iocchi, Matteo Leonetti

TL;DR
This paper introduces Realizable Abstractions for hierarchical reinforcement learning, providing a formal framework with near-optimal guarantees, and proposes RARL, an algorithm that efficiently learns near-optimal policies using these abstractions.
Contribution
It defines a new formal notion of Realizable Abstractions that overcomes previous limitations, and develops RARL, a practical algorithm leveraging these abstractions for efficient learning.
Findings
RARL converges in polynomial time.
It is robust to abstraction inaccuracies.
It provides near-optimal policies via compositional options.
Abstract
The main focus of Hierarchical Reinforcement Learning (HRL) is studying how large Markov Decision Processes (MDPs) can be more efficiently solved when addressed in a modular way, by combining partial solutions computed for smaller subtasks. Despite their very intuitive role for learning, most notions of MDP abstractions proposed in the HRL literature have limited expressive power or do not possess formal efficiency guarantees. This work addresses these fundamental issues by defining Realizable Abstractions, a new relation between generic low-level MDPs and their associated high-level decision processes. The notion we propose avoids non-Markovianity issues and has desirable near-optimality guarantees. Indeed, we show that any abstract policy for Realizable Abstractions can be translated into near-optimal policies for the low-level MDP, through a suitable composition of options. As…
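The idea of translating an abstract policy into a low-level policy by composing options can be illustrated with a toy sketch. The following is not RARL and not the paper's construction. It assumes a hypothetical six-cell corridor, a hand-written state abstraction `phi`, and hand-written options (all names are invented for illustration). It shows only the composition step: the abstract policy picks a target block, and the matching option acts in the ground MDP until the block changes.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical toy corridor (an assumption for illustration):
# ground states 0..5, deterministic moves, goal at the right end.
N_STATES = 6
GOAL = 5


def phi(s: int) -> str:
    """State abstraction: cells 0-1 -> "A", 2-3 -> "B", 4-5 -> "C"."""
    return "ABC"[s // 2]


def step(s: int, a: int) -> int:
    """Deterministic ground transition; a is -1 (left) or +1 (right)."""
    return min(max(s + a, 0), N_STATES - 1)


# An option is modelled as a ground policy that is followed while the
# agent remains in the block where the option was started.
Option = Callable[[int], int]

# One hand-written option per abstract transition (block, target block).
options: Dict[Tuple[str, str], Option] = {
    ("A", "B"): lambda s: +1,
    ("B", "C"): lambda s: +1,
    ("C", "C"): lambda s: +1,  # stay in C while moving towards the goal
}

# Abstract policy: for each block, the block to reach next.
abstract_policy: Dict[str, str] = {"A": "B", "B": "C", "C": "C"}


def run_composed_policy(s0: int, max_steps: int = 20) -> List[int]:
    """Follow the abstract policy by chaining the matching options."""
    trajectory = [s0]
    s = s0
    for _ in range(max_steps):
        if s == GOAL:
            break
        block = phi(s)
        # Select the option realizing the abstract transition chosen
        # by the abstract policy in the current block.
        option = options[(block, abstract_policy[block])]
        s = step(s, option(s))
        trajectory.append(s)
    return trajectory


print(run_composed_policy(0))  # [0, 1, 2, 3, 4, 5]
```

In this sketch the options are given by hand. In the setting the abstract describes, they would instead have to be learned, and the near-optimality guarantee depends on the abstraction satisfying the paper's realizability conditions, which this toy example does not check.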
Peer Reviews
Decision · Submitted to ICLR 2025
Very clear, very well written.
I appreciated the example illustrated in figure 1. I would suggest the authors refer to it when introducing new concepts and when discussing RARL.
The paper addresses an important problem in hierarchical RL dealing with abstractions which relate high-level and low-level representations. Connecting hierarchical abstractions to constrained MDPs such that options can be extracted by solving the CMDPs with off-the-shelf algorithms is interesting and, to my knowledge, novel. The paper is well-written and the intuitive explanations for the theory are fairly easy to follow, though it is extremely notation-heavy.
While an algorithm is proposed (RARL), there are no empirical results to support it and validate the assumptions that are made for the guarantees in Section 4. I would like to see RARL compared with existing methods, e.g., some form of option-critic (with specified options) or deep skill chaining (for a skill discovery comparison). Additionally, as it is not clear to me how reasonable Assumptions 1-3 are, the paper would be strengthened by experiments showing how RARL is affected by violations o
Realizable Abstractions offer a fresh theoretical foundation for HRL, opening up a promising way for potentially reducing sample complexity in reinforcement learning. I find it particularly intriguing that sparse rewards play a crucial role in ensuring admissibility (Proposition 6).
While the new concept of Realizable Abstractions is interesting, the paper lacks some key definitions, making it challenging to follow. As a result, it’s difficult to fully grasp the significance of the paper’s main contributions. The following are the main weaknesses of this paper: - The proposed algorithm requires a **known** abstraction and assumes admissibility, which feels like a rather strong assumption. However, there is not enough rigorous discussion regarding these assumptions and their
Taxonomy
Topics: Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research · Gaussian Processes and Bayesian Inference
