Expert-Guided Symmetry Detection in Markov Decision Processes

Giorgio Angelotti; Nicolas Drougard; Caroline P. C. Chanel

arXiv:2111.10297·cs.LG·March 8, 2022

TL;DR

This paper introduces a method to detect and leverage symmetries in Markov Decision Processes using density estimation, improving data efficiency and model learning in reinforcement learning environments.

Contribution

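The paper proposes a symmetry detection paradigm based on density estimation that checks whether presupposed transformations of the state-action space leave the MDP dynamics invariant. Detected symmetries can then be exploited when learning the MDP and when computing its control policy.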

Findings

01 Reduced model distributional shift with symmetry-based data augmentation

02 Improved data efficiency in learning transition functions

03 Effective symmetry detection in benchmark environments

Abstract

Learning a Markov Decision Process (MDP) from a fixed batch of trajectories is a non-trivial task whose outcome's quality depends on both the amount and the diversity of the sampled regions of the state-action space. Yet, many MDPs are endowed with reward and transition functions that are invariant with respect to some transformations of the current state and action. Being able to detect and exploit these structures could benefit not only the learning of the MDP but also the computation of its subsequent optimal control policy. In this work we propose a paradigm, based on Density Estimation methods, that aims to detect the presence of some already supposed transformations of the state-action space for which the MDP dynamics is invariant. We tested the proposed approach in a discrete toroidal grid environment and in two well-known environments of OpenAI's Gym Learning Suite. The results…
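To make the idea in the abstract concrete, here is a minimal sketch of one way a density-based symmetry check could look on a toy toroidal grid. It is not the authors' algorithm: the count-based estimator stands in for a density estimator, and the grid size, slip probability, threshold, and all function names (`step`, `collect_batch`, `fit_density`, `symmetry_score`, `mirror`, `mirror_states_only`) are assumptions made for illustration.

```python
import random
from collections import Counter

N = 4  # side of the toroidal grid (assumed for this toy example)
MOVES = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}
SWAP_LR = {"left": "right", "right": "left", "up": "up", "down": "down"}

def step(state, action, rng, slip=0.1):
    """Toy dynamics: move on the torus; with probability `slip` stay put."""
    if rng.random() < slip:
        return state
    dx, dy = MOVES[action]
    return ((state[0] + dx) % N, (state[1] + dy) % N)

def collect_batch(n, rng):
    """Sample n transitions (s, a, s') from uniformly random states and actions."""
    batch = []
    for _ in range(n):
        s = (rng.randrange(N), rng.randrange(N))
        a = rng.choice(list(MOVES))
        batch.append((s, a, step(s, a, rng)))
    return batch

def fit_density(batch):
    """Count-based estimate of P(s' | s, a); a stand-in for a density estimator."""
    joint = Counter(batch)
    marginal = Counter((s, a) for s, a, _ in batch)

    def density(s, a, s_next):
        if marginal[(s, a)] == 0:
            return 0.0
        return joint[(s, a, s_next)] / marginal[(s, a)]

    return density

def symmetry_score(batch, transform, threshold=0.05):
    """Fraction of transformed transitions that the fitted density finds plausible."""
    density = fit_density(batch)
    hits = sum(density(*transform(t)) > threshold for t in batch)
    return hits / len(batch)

def mirror(t):
    """Reflect x and swap left/right: a true symmetry of the toy dynamics."""
    (x, y), a, (x2, y2) = t
    return ((-x % N, y), SWAP_LR[a], (-x2 % N, y2))

def mirror_states_only(t):
    """Reflect x but keep the action: not a symmetry of the toy dynamics."""
    (x, y), a, (x2, y2) = t
    return ((-x % N, y), a, (-x2 % N, y2))

rng = random.Random(0)
batch = collect_batch(4000, rng)
print("mirror:", symmetry_score(batch, mirror))
print("mirror_states_only:", symmetry_score(batch, mirror_states_only))
```

In this toy setting the true symmetry should score close to 1, while the states-only reflection should score clearly lower because transformed left and right moves contradict the estimated dynamics. A transformation that passes such a check could then be used to augment the batch with transformed transitions.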
