Low-Dimensional State and Action Representation Learning with MDP Homomorphism Metrics
Nicolò Botteghi, Mannes Poel, Beril Sirmacek, Christoph Brune

TL;DR
This paper introduces a framework for sample-efficient deep reinforcement learning by learning low-dimensional, interpretable state and action representations, enabling effective policy learning and transfer from high-dimensional observations.
Contribution
It proposes a novel method that leverages state and action representations with MDP homomorphism metrics to improve sample efficiency and interpretability in reinforcement learning.
Findings
Efficient learning of low-dimensional, interpretable representations.
Successful transfer of policies from latent to original domain.
Enhanced sample efficiency in complex environments.
Abstract
Deep Reinforcement Learning has shown its ability to solve complicated problems directly from high-dimensional observations. However, in end-to-end settings, Reinforcement Learning algorithms are not sample-efficient and require long training times and large quantities of data. In this work, we propose a framework for sample-efficient Reinforcement Learning that takes advantage of state and action representations to transform a high-dimensional problem into a low-dimensional one. Moreover, we seek to find the optimal policy mapping latent states to latent actions. Because the policy is now learned on abstract representations, we enforce, using auxiliary loss functions, the lifting of such policy to the original problem domain. Results show that the novel framework can efficiently learn low-dimensional and interpretable state and action representations and the optimal latent policy.
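The data flow the abstract describes (observation to latent state, latent state to latent action, latent action lifted back to the original action space) can be illustrated with a minimal sketch. This is not the authors' implementation: the dimensions, the names `encode_state`, `latent_policy`, and `lift_action`, and the random linear maps standing in for trained networks are all assumptions made for illustration, and the auxiliary losses that would train these maps are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
OBS_DIM, LATENT_STATE_DIM = 64, 4
ACTION_DIM, LATENT_ACTION_DIM = 6, 2

# Random linear maps stand in for trained networks (assumption, not the paper's models).
W_state = rng.normal(size=(LATENT_STATE_DIM, OBS_DIM)) / np.sqrt(OBS_DIM)
W_policy = rng.normal(size=(LATENT_ACTION_DIM, LATENT_STATE_DIM))
W_lift = rng.normal(size=(ACTION_DIM, LATENT_ACTION_DIM))


def encode_state(obs):
    """Map a high-dimensional observation to a low-dimensional latent state."""
    return np.tanh(W_state @ obs)


def latent_policy(latent_state):
    """Choose a latent action from a latent state."""
    return np.tanh(W_policy @ latent_state)


def lift_action(latent_action):
    """Lift a latent action back to the original action space."""
    return np.tanh(W_lift @ latent_action)


def act(obs):
    """Full pipeline: observation -> latent state -> latent action -> action."""
    return lift_action(latent_policy(encode_state(obs)))


obs = rng.normal(size=OBS_DIM)
action = act(obs)
print(action.shape)  # (6,)
```

The point of the sketch is only that the policy itself operates on the small latent spaces, while the encoder and the lifting map handle the interface with the original high-dimensional problem.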
Taxonomy
Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Data Stream Mining Techniques
