Reinforcement Learning in Presence of Discrete Markovian Context Evolution
Hang Ren, Aivar Sootla, Taher Jafferjee, Junxiao Shen, Jun Wang, and Haitham Bou-Ammar

TL;DR
This paper introduces a Bayesian approach with variational inference for context-dependent reinforcement learning with unknown, changing, Markovian contexts, enabling effective policy learning and outperforming existing methods in complex environments.
Contribution
It proposes a novel Bayesian framework using a sticky Hierarchical Dirichlet Process for modeling and inferring multiple Markovian contexts in RL, including a context distillation procedure.
Findings
Successfully infers the number of contexts from data.
Outperforms state-of-the-art methods in complex RL environments.
Demonstrates robustness to abrupt and discontinuous context changes.
Abstract
We consider a context-dependent Reinforcement Learning (RL) setting, which is characterized by: a) an unknown finite number of not directly observable contexts; b) abrupt (discontinuous) context changes occurring during an episode; and c) Markovian context evolution. We argue that this challenging case is often met in applications, and we tackle it using a Bayesian approach and variational inference. We adapt a sticky Hierarchical Dirichlet Process (HDP) prior for model learning, which is arguably best suited for Markov process modeling. We then derive a context distillation procedure, which identifies and removes spurious contexts in an unsupervised fashion. We argue that the combination of these two components allows us to infer the number of contexts from data, thus dealing with the context cardinality assumption. We then find the representation of the optimal policy enabling efficient…
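To make the setting in the abstract concrete, the following is a minimal sketch of how a sticky HDP prior can generate a Markovian context sequence with abrupt switches. It uses the weak-limit (finite truncation) approximation of the sticky HDP, which is a common simplification and not necessarily what the paper implements; it also only samples from the prior and does not perform the variational inference or context distillation the authors describe. The function names and the parameters `gamma`, `alpha`, `kappa`, and `num_contexts` are assumptions chosen for illustration.

```python
import numpy as np


def sample_sticky_hdp_transitions(num_contexts, gamma, alpha, kappa, rng):
    """Sample a context transition matrix from a truncated sticky HDP prior.

    Assumption: weak-limit approximation with `num_contexts` as the
    truncation level (an upper bound on the number of contexts).
    """
    # Global context weights shared by every row of the transition matrix.
    beta = rng.dirichlet(np.full(num_contexts, gamma / num_contexts))
    rows = []
    for j in range(num_contexts):
        # The "sticky" term kappa adds extra mass to self-transitions.
        concentration = alpha * beta + kappa * np.eye(num_contexts)[j]
        # Small constant guards against numerically zero concentrations.
        rows.append(rng.dirichlet(concentration + 1e-8))
    return np.vstack(rows)


def simulate_contexts(transitions, horizon, rng):
    """Roll out a hidden context sequence as a discrete Markov chain."""
    num_contexts = transitions.shape[0]
    contexts = np.empty(horizon, dtype=int)
    contexts[0] = rng.integers(num_contexts)
    for t in range(1, horizon):
        contexts[t] = rng.choice(num_contexts, p=transitions[contexts[t - 1]])
    return contexts


rng = np.random.default_rng(0)
P = sample_sticky_hdp_transitions(num_contexts=5, gamma=5.0, alpha=1.0, kappa=20.0, rng=rng)
contexts = simulate_contexts(P, horizon=200, rng=rng)
```

A larger `kappa` makes contexts persist longer between switches, which corresponds to the abrupt but infrequent context changes within an episode that the abstract describes. In the paper's setting the contexts are not observed, so a sequence like `contexts` would have to be inferred from transitions rather than read off directly.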
Taxonomy
Topics: Reinforcement Learning in Robotics
