State Abstraction in MAXQ Hierarchical Reinforcement Learning
Thomas G. Dietterich

TL;DR
This paper investigates how state abstractions can be integrated into the MAXQ hierarchical reinforcement learning framework, providing theoretical convergence guarantees and demonstrating their importance through experiments.
Contribution
It defines five conditions under which state abstraction can be combined with the MAXQ value function decomposition, proves that MAXQ-Q learning converges under these conditions, and shows experimentally that state abstraction is important for applying MAXQ-Q learning successfully.
Findings
MAXQ-Q learning converges when the five state abstraction conditions hold
State abstraction improves learning efficiency
Experimental results confirm the importance of state abstraction
Abstract
Many researchers have explored methods for hierarchical reinforcement learning (RL) with temporal abstractions, in which abstract actions are defined that can perform many primitive actions before terminating. However, little is known about learning with state abstractions, in which aspects of the state space are ignored. In previous work, we developed the MAXQ method for hierarchical RL. In this paper, we define five conditions under which state abstraction can be combined with the MAXQ value function decomposition. We prove that the MAXQ-Q learning algorithm converges under these conditions and show experimentally that state abstraction is important for the successful application of MAXQ-Q learning.
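To make the idea of combining state abstraction with the value function decomposition concrete, here is a minimal sketch. It assumes the standard MAXQ decomposition from the earlier MAXQ work, Q(i, s, a) = V(a, s) + C(i, s, a), where C is the completion function of subtask i. The class `MaxNode`, the `abstract` mapping, and the two-variable state `(position, light)` are hypothetical names chosen for illustration; they are not taken from the paper, and the sketch does not implement the five conditions or the MAXQ-Q learning updates.

```python
from collections import defaultdict


class MaxNode:
    """A subtask or primitive action in a hypothetical MAXQ task graph."""

    def __init__(self, name, children=(), abstract=lambda s: s):
        self.name = name
        self.children = list(children)
        # Assumption: each node supplies a function that keeps only the
        # state variables relevant to it and drops the rest.
        self.abstract = abstract
        self.V = defaultdict(float)  # primitive nodes: expected reward
        self.C = defaultdict(float)  # composite nodes: completion values

    def is_primitive(self):
        return not self.children


def value(node, state):
    """V(i, s): stored reward for primitives, max over children otherwise."""
    if node.is_primitive():
        return node.V[node.abstract(state)]
    return max(q_value(node, state, child) for child in node.children)


def q_value(node, state, child):
    """Q(i, s, a) = V(a, s) + C(i, s, a), looked up on abstracted states."""
    return value(child, state) + node.C[(node.abstract(state), child.name)]


# Hypothetical state: (position, light). Both nodes ignore `light`.
step = MaxNode("step", abstract=lambda s: s[0])
root = MaxNode("root", children=[step], abstract=lambda s: s[0])
step.V[0] = -1.0
root.C[(0, "step")] = 5.0

v_red = value(root, (0, "red"))
v_blue = value(root, (0, "blue"))
```

In this sketch the two full states differ only in the ignored variable, so they share a single table entry per node. That sharing is the storage and learning saving that state abstraction is meant to provide; whether ignoring a variable is safe is exactly what the paper's five conditions address.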
Taxonomy
Topics: Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications · Adaptive Dynamic Programming Control
