Blockwise Sequential Model Learning for Partially Observable Reinforcement Learning

Giseung Park; Sungho Choi; Youngchul Sung

arXiv:2112.05343 · cs.LG · December 13, 2021

TL;DR

This paper introduces a blockwise sequential model learning architecture based on self-attention for partially observable reinforcement learning. It aims to capture long-term dependencies, and its model learning avoids complex blockwise input data reconstruction.

Contribution

The paper presents a novel blockwise sequential model with self-attention for better handling partial observability in reinforcement learning, improving over traditional RNN-based methods.
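
The blockwise idea can be illustrated with a minimal sketch: split an observation sequence into fixed-length blocks, apply self-attention within each block, and reduce each block to one summary vector. This is not the paper's architecture. The identity query/key/value projections, the mean pooling, and the names `split_into_blocks`, `self_attention`, and `block_summary` are all assumptions made for illustration.

```python
import math


def split_into_blocks(seq, block_len):
    """Split a list of timestep vectors into consecutive blocks."""
    return [seq[i:i + block_len] for i in range(0, len(seq), block_len)]


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(block):
    """Single-head scaled dot-product self-attention.

    Assumption: queries, keys, and values are the raw inputs
    (identity projections); a real model would learn them.
    """
    d = len(block[0])
    out = []
    for q in block:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in block]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, block)) for j in range(d)])
    return out


def block_summary(block):
    """Mean-pool the attended block into one vector (hypothetical choice)."""
    attended = self_attention(block)
    d = len(attended[0])
    return [sum(row[j] for row in attended) / len(attended) for j in range(d)]


# Toy sequence of 8 timesteps with 2 features each, in blocks of 4.
observations = [[float(t), float(t % 3)] for t in range(8)]
blocks = split_into_blocks(observations, block_len=4)
summaries = [block_summary(b) for b in blocks]
```

Each entry of `summaries` stands in for the per-block information that would be handed to the next block, in contrast to an RNN that compresses the history at every timestep.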

Findings

01. Significantly outperforms previous methods in various environments.

02. Estimates gradients efficiently using self-normalized importance sampling.

03. Is capable of detailed sequential learning in partially observable settings.

Abstract

This paper proposes a new sequential model learning architecture to solve partially observable Markov decision problems. Rather than compressing sequential information at every timestep as in conventional recurrent neural network-based methods, the proposed architecture generates a latent variable in each data block with a length of multiple timesteps and passes the most relevant information to the next block for policy optimization. The proposed blockwise sequential model is implemented based on self-attention, making the model capable of detailed sequential learning in partially observable settings. The proposed model builds an additional learning network to efficiently implement gradient estimation by using self-normalized importance sampling, which does not require the complex blockwise input data reconstruction in the model learning. Numerical results show that the proposed method…
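
Self-normalized importance sampling, which the abstract names as the gradient estimation tool, can be shown in its generic form with a short sketch. The example below estimates an expectation under a target density using samples from a different proposal density, with weights normalized to sum to one. The Gaussian target and proposal, and the names `snis_estimate` and `log_normal_unnorm`, are assumptions for illustration and are not taken from the paper.

```python
import math
import random


def snis_estimate(f, log_p, log_q, samples):
    """Self-normalized importance sampling estimate of E_p[f(x)].

    `samples` are drawn from the proposal q. `log_p` and `log_q` may be
    unnormalized log densities, since constants cancel after normalization.
    """
    log_w = [log_p(x) - log_q(x) for x in samples]
    m = max(log_w)  # subtract the max for numerical stability
    w = [math.exp(lw - m) for lw in log_w]
    total = sum(w)
    return sum(wi * f(x) for wi, x in zip(w, samples)) / total


def log_normal_unnorm(mean, std):
    """Unnormalized log density of a Gaussian (hypothetical helper)."""
    return lambda x: -0.5 * ((x - mean) / std) ** 2


rng = random.Random(0)
# Proposal q: N(0, 2^2). Target p: N(1, 1), whose mean is 1.
samples = [rng.gauss(0.0, 2.0) for _ in range(20000)]
estimate = snis_estimate(
    lambda x: x,
    log_normal_unnorm(1.0, 1.0),
    log_normal_unnorm(0.0, 2.0),
    samples,
)
```

Because the weights are divided by their sum, neither density needs a known normalizing constant, which is the property that makes the estimator convenient when the target is only available up to scale.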

Code & Models

Repositories

giseung-park/blockseq
PyTorch · Official

Videos

Blockwise Sequential Model Learning for Partially Observable Reinforcement Learning

Taxonomy

Topics: Reinforcement Learning in Robotics · Fault Detection and Control Systems · Gaussian Processes and Bayesian Inference