A compact, hierarchical Q-function decomposition
Bhaskara Marthi, Stuart Russell, David Andre

TL;DR
This paper introduces a hierarchical Q-function decomposition that expresses a subroutine's exit values recursively in terms of Q-functions at higher levels of the hierarchy. In many cases this lets a child subroutine account for where it exits without the cost of representing exit values explicitly.
Contribution
It proposes a recursive decomposition of exit value functions, reducing representation costs and improving hierarchical RL performance.
Findings
Effective in complex environments
Reduces representation complexity
Improves hierarchical decision-making
Abstract
Previous work in hierarchical reinforcement learning has faced a dilemma: either ignore the values of different possible exit states from a subroutine, thereby risking suboptimal behavior, or represent those values explicitly, thereby incurring a possibly large representation cost because exit values refer to nonlocal aspects of the world (i.e., all subsequent rewards). This paper shows that, in many cases, one can avoid both of these problems. The solution is based on recursively decomposing the exit value function in terms of Q-functions at higher levels of the hierarchy. This leads to an intuitively appealing runtime architecture in which a parent subroutine passes to its child a value function on the exit states and the child reasons about how its choices affect the exit value. We also identify structural conditions on the value function and transition distributions that allow much…
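The runtime architecture described in the abstract, where a parent passes its child a value function on exit states, can be illustrated with a toy sketch. This is not the authors' algorithm and does not reproduce the paper's recursive decomposition. The names `child_choose`, `model`, `ignore_exits`, and `parent_exit_value`, along with the two-door example and all numbers, are hypothetical and chosen only to show why ignoring exit values can lead to a worse choice.

```python
from typing import Callable, Dict, Hashable, List, Tuple

State = Hashable
Action = str
# A value function on exit states, supplied by the parent subroutine.
ExitValue = Callable[[State], float]


def child_choose(
    state: State,
    actions: List[Action],
    model: Dict[Tuple[State, Action], Tuple[float, State]],
    exit_value: ExitValue,
) -> Action:
    """Pick the action maximizing local reward plus the parent's exit value.

    `model` is an assumed deterministic one-step model mapping
    (state, action) to (local_reward, exit_state).
    """
    def q(action: Action) -> float:
        local_reward, exit_state = model[(state, action)]
        return local_reward + exit_value(exit_state)

    return max(actions, key=q)


# Hypothetical toy problem: the child can leave a room by either door.
model = {
    ("room", "east_door"): (-1.0, "east_hall"),  # cheap to reach
    ("room", "west_door"): (-3.0, "west_hall"),  # costlier to reach
}
actions = ["east_door", "west_door"]


def ignore_exits(exit_state: State) -> float:
    # First horn of the dilemma: treat every exit state as equally good.
    return 0.0


def parent_exit_value(exit_state: State) -> float:
    # Assumed values the parent assigns to what happens after the exit.
    return {"east_hall": -10.0, "west_hall": -2.0}[exit_state]


greedy = child_choose("room", actions, model, ignore_exits)
informed = child_choose("room", actions, model, parent_exit_value)
print(greedy, informed)
```

With these assumed numbers the child that ignores exit values takes the locally cheaper door, while the child that receives the parent's exit value function picks the door with the better total outcome.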
Taxonomy
Topics: Reinforcement Learning in Robotics · Formal Methods in Verification · Advanced Control Systems Optimization
