Reinforcement Learning in Presence of Discrete Markovian Context Evolution

Hang Ren; Aivar Sootla; Taher Jafferjee; Junxiao Shen; Jun Wang; Haitham Bou-Ammar

arXiv:2202.06557·cs.LG·February 15, 2022

TL;DR

This paper introduces a Bayesian approach with variational inference for context-dependent reinforcement learning with unknown, changing, Markovian contexts, enabling effective policy learning and outperforming existing methods in complex environments.

Contribution

It proposes a novel Bayesian framework using a sticky Hierarchical Dirichlet Process for modeling and inferring multiple Markovian contexts in RL, including a context distillation procedure.

Findings

01. Successfully infers the number of contexts from data.

02. Outperforms state-of-the-art methods in complex RL environments.

03. Demonstrates robustness to abrupt and discontinuous context changes.

Abstract

We consider a context-dependent Reinforcement Learning (RL) setting, which is characterized by: a) an unknown finite number of not directly observable contexts; b) abrupt (discontinuous) context changes occurring during an episode; and c) Markovian context evolution. We argue that this challenging case is often met in applications and we tackle it using a Bayesian approach and variational inference. We adapt a sticky Hierarchical Dirichlet Process (HDP) prior for model learning, which is arguably best-suited for Markov process modeling. We then derive a context distillation procedure, which identifies and removes spurious contexts in an unsupervised fashion. We argue that the combination of these two components allows us to infer the number of contexts from data, thus dealing with the context cardinality assumption. We then find the representation of the optimal policy enabling efficient…
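
To make the setting in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It draws a context transition matrix from a finite (weak-limit) approximation of a sticky HDP prior and then simulates a Markovian context sequence with abrupt switches. The function names, the truncation level `num_contexts`, and the hyperparameter values (`alpha`, `gamma`, `kappa`) are illustrative assumptions and do not come from the paper.

```python
import numpy as np


def sample_sticky_transitions(num_contexts, alpha, gamma, kappa, rng):
    """Draw a transition matrix from a truncated sticky HDP prior (sketch).

    Assumed finite approximation:
      beta  ~ Dirichlet(gamma / K, ..., gamma / K)
      row_j ~ Dirichlet(alpha * beta + kappa * e_j)
    where kappa adds extra prior mass to self-transitions.
    """
    beta = rng.dirichlet(np.full(num_contexts, gamma / num_contexts))
    identity = np.eye(num_contexts)
    # Tiny floor keeps every Dirichlet parameter strictly positive.
    rows = [
        rng.dirichlet(alpha * beta + kappa * identity[j] + 1e-8)
        for j in range(num_contexts)
    ]
    return np.vstack(rows)


def simulate_contexts(transitions, horizon, rng, start=0):
    """Simulate a hidden context sequence that evolves as a Markov chain."""
    num_contexts = transitions.shape[0]
    contexts = np.empty(horizon, dtype=int)
    current = start
    for t in range(horizon):
        contexts[t] = current
        current = rng.choice(num_contexts, p=transitions[current])
    return contexts


# Illustrative values only (assumptions, not taken from the paper).
rng = np.random.default_rng(0)
P = sample_sticky_transitions(num_contexts=5, alpha=1.0, gamma=5.0, kappa=50.0, rng=rng)
contexts = simulate_contexts(P, horizon=200, rng=rng)
num_switches = int(np.sum(contexts[1:] != contexts[:-1]))
```

A large `kappa` relative to `alpha` makes each context persist for many steps before an abrupt switch, which matches the within-episode context changes described above. In the paper the transition structure is inferred from data rather than sampled, so this sketch only illustrates the generative assumption.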

Videos

Reinforcement Learning in Presence of Discrete Markovian Context Evolution (SlidesLive)

Taxonomy

Topics: Reinforcement Learning in Robotics