Low Dimensional State Representation Learning with Reward-shaped Priors

Nicolò Botteghi; Ruben Obbink; Daan Geijs; Mannes Poel; Beril Sirmacek; Christoph Brune; Abeje Mersha; Stefano Stramigioli

arXiv:2007.16044·cs.LG·August 2, 2021

TL;DR

This paper introduces a method for learning low-dimensional state representations in reinforcement learning for robotics, using reward-shaped priors to improve sample efficiency and facilitate faster policy learning.

Contribution

It proposes an unsupervised learning approach that incorporates environment and task priors into the state representation, enhancing sample efficiency in robotic RL tasks.

Findings

01 Effective in simulation navigation tasks

02 Successful transfer to real robot experiments

03 Improved sample efficiency in policy learning

Abstract

Reinforcement Learning has been able to solve many complicated robotics tasks in an end-to-end fashion, without any need for feature engineering. However, learning the optimal policy directly from the sensory inputs, i.e., the observations, often requires processing and storing a huge amount of data. In robotics, data from real hardware is usually very costly, so solutions that achieve high sample efficiency are needed. We propose a method that aims at learning a mapping from the observations into a lower-dimensional state space. This mapping is learned with unsupervised learning, using loss functions shaped to incorporate prior knowledge of the environment and the task. Using the samples from the state space, the optimal policy is quickly and efficiently learned. We test the method on several mobile robot navigation tasks in a simulation environment…
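
The abstract does not spell out the loss functions, so the following is only an illustrative sketch of the general idea: an encoder maps high-dimensional observations to a low-dimensional state, and the encoder is scored by losses that encode prior knowledge. The two losses here (a temporal-coherence term and a reward-based separation term) are assumptions loosely modeled on the "robotic priors" family of losses, not the paper's actual formulation. The linear encoder, the function names, and the weights `w_temp` and `w_sep` are likewise hypothetical.

```python
import math
import random


def encode(W, obs):
    """Hypothetical linear encoder: state = W @ obs, with W given as a list of rows."""
    return [sum(w * o for w, o in zip(row, obs)) for row in W]


def sq_dist(a, b):
    """Squared Euclidean distance between two state vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def temporal_coherence_loss(states):
    """Assumed prior: consecutive states should change slowly.

    Returns the mean squared distance between consecutive states.
    """
    diffs = [sq_dist(s0, s1) for s0, s1 in zip(states, states[1:])]
    return sum(diffs) / len(diffs)


def reward_separation_loss(states, rewards):
    """Assumed prior: states that received different rewards should lie far apart.

    Averages exp(-distance^2) over all pairs with different rewards, so the
    term is close to 1 for overlapping states and close to 0 for distant ones.
    (Simplification: actions are ignored.)
    """
    total, count = 0.0, 0
    for i in range(len(states)):
        for j in range(i + 1, len(states)):
            if rewards[i] != rewards[j]:
                total += math.exp(-sq_dist(states[i], states[j]))
                count += 1
    return total / count if count else 0.0


def representation_loss(W, observations, rewards, w_temp=1.0, w_sep=1.0):
    """Weighted sum of the two assumed prior losses for encoder weights W."""
    states = [encode(W, o) for o in observations]
    return (w_temp * temporal_coherence_loss(states)
            + w_sep * reward_separation_loss(states, rewards))


# Usage: score a random encoder from 8-dimensional observations to a 2-D state.
random.seed(0)
obs_dim, state_dim, steps = 8, 2, 20
observations = [[random.gauss(0, 1) for _ in range(obs_dim)] for _ in range(steps)]
rewards = [random.choice([0.0, 1.0]) for _ in range(steps)]
W = [[random.gauss(0, 0.1) for _ in range(obs_dim)] for _ in range(state_dim)]
loss = representation_loss(W, observations, rewards)
print(f"loss for random encoder: {loss:.4f}")
```

In a full pipeline the encoder weights would be adjusted to minimize such a loss, and a policy would then be trained on the encoded states rather than on the raw observations.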
