Deep Hierarchical Reinforcement Learning Algorithm in Partially Observable Markov Decision Processes

Le Pham Tuyen; Ngo Anh Vien; Abu Layek; TaeChoong Chung

arXiv:1805.04419·cs.AI·October 30, 2024

TL;DR

This paper introduces a deep hierarchical reinforcement learning algorithm for partially observable Markov decision processes (POMDPs), targeting RL tasks that combine hierarchical structure with partial observability.

Contribution

It proposes a novel deep hierarchical RL method applicable to both MDPs and POMDPs, enhancing learning in hierarchical, partially observable environments.

Findings

01. Effective in complex hierarchical POMDPs

02. Improves learning efficiency in partially observable settings

03. Demonstrates superior performance over baseline methods

Abstract

In recent years, reinforcement learning has achieved many remarkable successes due to the growing adoption of deep learning techniques and the rapid growth in computing power. Nevertheless, it is well known that flat reinforcement learning algorithms often fail to learn well or data-efficiently in tasks with hierarchical structure, e.g. tasks consisting of multiple subtasks. Hierarchical reinforcement learning is a principled approach that is able to tackle these challenging tasks. On the other hand, many real-world tasks offer only partial observability, in which state measurements are often imperfect or incomplete. The problem of RL in such settings can be formulated as a partially observable Markov decision process (POMDP). In this paper, we study hierarchical RL in POMDPs, in which the tasks have only partial observability and possess hierarchical properties. We…
