Globally Optimal Hierarchical Reinforcement Learning for   Linearly-Solvable Markov Decision Processes

Guillermo Infante; Anders Jonsson; Vicen\c{c} G\'omez

arXiv:2106.15380·cs.LG·June 4, 2024

Globally Optimal Hierarchical Reinforcement Learning for Linearly-Solvable Markov Decision Processes

Guillermo Infante, Anders Jonsson, Vicen\c{c} G\'omez

PDF

Open Access 1 Repo 1 Video

TL;DR

This paper introduces a hierarchical reinforcement learning method for linearly-solvable Markov decision processes that efficiently learns globally optimal policies by leveraging state space partitioning and value function compositionality.

Contribution

It presents a novel hierarchical approach that estimates optimal values at multiple abstraction levels, enabling globally optimal policy learning without non-stationarity issues.

Findings

01

Can learn globally optimal policies

02

Reduces sample complexity in certain settings

03

Validated through multiple experiments

Abstract

In this work we present a novel approach to hierarchical reinforcement learning for linearly-solvable Markov decision processes. Our approach assumes that the state space is partitioned, and the subtasks consist in moving between the partitions. We represent value functions on several levels of abstraction, and use the compositionality of subtasks to estimate the optimal values of the states in each partition. The policy is implicitly defined on these optimal value estimates, rather than being decomposed among the subtasks. As a consequence, our approach can learn the globally optimal policy, and does not suffer from the non-stationarity of high-level decisions. If several partitions have equivalent dynamics, the subtasks of those partitions can be shared. If the set of boundary states is smaller than the entire state space, our approach can have significantly smaller sample complexity…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Code & Models

Repositories

guillermoim/HRL_LMDP
noneOfficial

Videos

Globally Optimal Hierarchical Reinforcement Learning for Linearly-Solvable Markov Decision Processes· underline

Taxonomy

TopicsReinforcement Learning in Robotics