Training and Evaluation of Deep Policies using Reinforcement Learning and Generative Models

Ali Ghadirzadeh; Petra Poklukar; Karol Arndt; Chelsea Finn; Ville Kyrki; Danica Kragic; Mårten Björkman

arXiv:2204.08573 · cs.LG · April 20, 2022 · 1 citation



Open Access

TL;DR

This paper introduces GenRL, a data-efficient framework that combines reinforcement learning with latent variable generative models for safe policy training in robotics. It also provides evaluation measures for the generative models that predict RL policy performance, and it reports outperforming two state-of-the-art RL methods on robotics tasks.

Contribution

The paper proposes GenRL, a novel approach that integrates latent variable generative models with RL to improve data efficiency and safety in robotic policy learning.

Findings

01. GenRL outperforms two state-of-the-art RL methods in robotics tasks.

02. Generative models' characteristics significantly influence policy performance.

03. Evaluation measures can predict RL policy success before physical training.

Abstract

We present a data-efficient framework for solving sequential decision-making problems which exploits the combination of reinforcement learning (RL) and latent variable generative models. The framework, called GenRL, trains deep policies by introducing an action latent variable such that the feed-forward policy search can be divided into two parts: (i) training a sub-policy that outputs a distribution over the action latent variable given a state of the system, and (ii) unsupervised training of a generative model that outputs a sequence of motor actions conditioned on the latent action variable. GenRL enables safe exploration and alleviates the data-inefficiency problem as it exploits prior knowledge about valid sequences of motor actions. Moreover, we provide a set of measures for evaluation of generative models such that we are able to predict the performance of the RL policy training…
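
The two-part split described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the dimensions, the linear maps standing in for neural networks, the Gaussian form of the latent distribution, the tanh squashing, and all function names (`sub_policy`, `decode`, `act`) are assumptions made purely for illustration. In particular, the decoder weights here are random, whereas the abstract describes a generative model trained in an unsupervised way on valid sequences of motor actions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for this sketch.
STATE_DIM, LATENT_DIM, HORIZON, ACTION_DIM = 4, 2, 10, 3

# (i) Sub-policy: maps a state to a distribution over the action latent
# variable. A linear-Gaussian map stands in for a deep network (assumption).
W_mu = rng.normal(scale=0.1, size=(LATENT_DIM, STATE_DIM))
log_std = np.full(LATENT_DIM, -1.0)

def sub_policy(state):
    """Return mean and std of a Gaussian over the latent action variable."""
    mean = W_mu @ state
    std = np.exp(log_std)
    return mean, std

# (ii) Generative model (decoder part): maps a latent action to a whole
# sequence of motor actions. Random weights stand in for a trained model.
W_dec = rng.normal(scale=0.1, size=(HORIZON * ACTION_DIM, LATENT_DIM))

def decode(latent):
    """Return a (HORIZON, ACTION_DIM) sequence of motor actions in [-1, 1]."""
    flat = np.tanh(W_dec @ latent)  # tanh bounding is an illustrative choice
    return flat.reshape(HORIZON, ACTION_DIM)

def act(state):
    """Sample a latent action from the sub-policy, then decode it."""
    mean, std = sub_policy(state)
    latent = mean + std * rng.standard_normal(LATENT_DIM)
    return decode(latent)

actions = act(np.ones(STATE_DIM))
print(actions.shape)
```

In this arrangement the RL problem only has to search the low-dimensional latent space, while every emitted action sequence passes through the decoder, which is where prior knowledge about valid motor sequences would enter once that model is actually trained.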


Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning