Low-Dimensional State and Action Representation Learning with MDP Homomorphism Metrics

Nicolò Botteghi; Mannes Poel; Beril Sirmacek; Christoph Brune

arXiv:2107.01677 · cs.LG · July 6, 2021 · 1 citation

Open Access

TL;DR

This paper introduces a framework for sample-efficient deep reinforcement learning by learning low-dimensional, interpretable state and action representations, enabling effective policy learning and transfer from high-dimensional observations.

Contribution

It proposes a novel method that leverages state and action representations with MDP homomorphism metrics to improve sample efficiency and interpretability in reinforcement learning.

Findings

01. Efficient learning of low-dimensional, interpretable representations.

02. Successful transfer of policies from latent to original domain.

03. Enhanced sample efficiency in complex environments.

Abstract

Deep Reinforcement Learning has shown its ability to solve complicated problems directly from high-dimensional observations. However, in end-to-end settings, Reinforcement Learning algorithms are not sample-efficient and require long training times and large quantities of data. In this work, we propose a framework for sample-efficient Reinforcement Learning that takes advantage of state and action representations to transform a high-dimensional problem into a low-dimensional one. Moreover, we seek the optimal policy mapping latent states to latent actions. Because the policy is now learned on abstract representations, we use auxiliary loss functions to enforce the lifting of this policy to the original problem domain. Results show that the novel framework can efficiently learn low-dimensional and interpretable state and action representations and the optimal latent policy.
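To make the idea of state and action representations tied together by an MDP homomorphism more concrete, the following is a minimal sketch and not the paper's implementation. It assumes a deterministic MDP and uses small linear maps in place of neural networks; the names `homomorphism_loss`, `W_state`, `W_action`, `W_trans`, and `w_reward` are hypothetical. The loss penalizes violations of the two standard homomorphism conditions: the latent transition applied to the encoded state and action should match the encoding of the next state, and the latent reward should match the observed reward. The lifting of the latent policy back to the original action space is not covered here.

```python
# Illustrative sketch only: linear maps stand in for learned networks.
# All function and parameter names are hypothetical, not from the paper.

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def homomorphism_loss(batch, W_state, W_action, W_trans, w_reward):
    """Mean violation of the deterministic MDP homomorphism conditions.

    batch    : list of (s, a, r, s_next) tuples from the original MDP
    W_state  : state encoder, maps s to latent state z
    W_action : action encoder, maps a to latent action u
    W_trans  : latent transition, acts on the concatenation [z, u]
    w_reward : latent reward weights, act on the concatenation [z, u]
    """
    total = 0.0
    for s, a, r, s_next in batch:
        z = matvec(W_state, s)
        u = matvec(W_action, a)
        z_next = matvec(W_state, s_next)
        zu = z + u  # list concatenation, not addition
        z_pred = matvec(W_trans, zu)
        r_pred = sum(w * x for w, x in zip(w_reward, zu))
        # Transition consistency plus reward consistency.
        total += sq_dist(z_pred, z_next) + (r_pred - r) ** 2
    return total / len(batch)

# Toy problem (an assumption for illustration): a hidden 1-D position p is
# observed as s = [p, 2p, 0], an action a = [d, d] moves it by d, and the
# reward is the next position p + d.
batch = [
    ([1.0, 2.0, 0.0], [0.5, 0.5], 1.5, [1.5, 3.0, 0.0]),
    ([2.0, 4.0, 0.0], [-1.0, -1.0], 1.0, [1.0, 2.0, 0.0]),
]
W_action = [[0.5, 0.5]]   # recovers d from a
W_trans = [[1.0, 1.0]]    # latent dynamics: z' = z + u
w_reward = [1.0, 1.0]     # latent reward: z + u

good_encoder = [[1.0, 0.0, 0.0]]  # recovers p from s
bad_encoder = [[0.0, 0.0, 1.0]]   # reads only the uninformative coordinate

good = homomorphism_loss(batch, good_encoder, W_action, W_trans, w_reward)
bad = homomorphism_loss(batch, bad_encoder, W_action, W_trans, w_reward)
```

In this toy setting the encoder that recovers the hidden position satisfies both conditions exactly, while the encoder that reads only the constant coordinate does not, so `good` is lower than `bad`. In practice the encoders and latent model would be trained jointly by minimizing a loss of this kind on sampled transitions.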

Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Data Stream Mining Techniques