Black-Box Data-efficient Policy Search for Robotics

Konstantinos Chatzilygeroudis; Roberto Rama; Rituraj Kaushik; Dorian Goepp; Vassilis Vassiliades; Jean-Baptiste Mouret

arXiv:1703.07261 · cs.RO · July 25, 2017 · 2 cites

TL;DR

This paper introduces Black-DROPS, a model-based reinforcement learning algorithm that is data-efficient and fast and that places no constraints on the reward function or the policy type. It is demonstrated in simulation and on real robots.

Contribution

Black-DROPS is a novel black-box RL algorithm that accounts for model uncertainty and optimizes policies without restrictions on the reward function or policy type. It matches state-of-the-art data efficiency and uses parallel processing to reduce computation time.

Findings

1. Black-DROPS achieves high data efficiency comparable to existing methods.
2. The algorithm performs well on standard control benchmarks in simulation.
3. It demonstrates effective real-world robotic control with a low-cost manipulator.

Abstract

The most data-efficient algorithms for reinforcement learning (RL) in robotics are based on uncertain dynamical models: after each episode, they first learn a dynamical model of the robot, then they use an optimization algorithm to find a policy that maximizes the expected return given the model and its uncertainties. It is often believed that this optimization can be tractable only if analytical, gradient-based algorithms are used; however, these algorithms require using specific families of reward functions and policies, which greatly limits the flexibility of the overall approach. In this paper, we introduce a novel model-based RL algorithm, called Black-DROPS (Black-box Data-efficient RObot Policy Search) that: (1) does not impose any constraint on the reward function or the policy (they are treated as black-boxes), (2) is as data-efficient as the state-of-the-art algorithm for…
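
The loop the abstract describes (run an episode, learn a dynamical model from the data, then search for a policy that maximizes the expected return under that model and its uncertainty) can be illustrated on a toy problem. The following is a minimal sketch, not the authors' implementation: the 1-D system, the linear least-squares model with residual noise as its uncertainty, the Monte Carlo return estimate, and the simple random search are all assumptions chosen for brevity, and every function name is hypothetical. What it does share with the description above is that the reward and the policy are plain callables that the optimizer never differentiates.

```python
import math
import random


def true_step(x, u, rng):
    """Toy 1-D 'robot', hidden from the learner (assumed for this sketch)."""
    return 0.9 * x + 0.5 * u + rng.gauss(0.0, 0.01)


def reward(x):
    """Black-box reward: any callable of the state; here, stay close to x = 1."""
    return -abs(x - 1.0)


def policy(theta, x):
    """Black-box policy: a saturated linear controller with parameters theta."""
    return max(-1.0, min(1.0, theta[0] * x + theta[1]))


def run_episode(theta, rng, steps=20):
    """Run the policy on the real system; return transitions and the return."""
    x, data, ret = 0.0, [], 0.0
    for _ in range(steps):
        u = policy(theta, x)
        x_next = true_step(x, u, rng)
        data.append((x, u, x_next))
        ret += reward(x_next)
        x = x_next
    return data, ret


def fit_model(data, ridge=1e-6):
    """Ridge least squares for x' ~ a*x + b*u; residual std is the uncertainty."""
    sxx = sum(x * x for x, u, y in data) + ridge
    suu = sum(u * u for x, u, y in data) + ridge
    sxu = sum(x * u for x, u, y in data)
    sxy = sum(x * y for x, u, y in data)
    suy = sum(u * y for x, u, y in data)
    det = sxx * suu - sxu * sxu
    a = (sxy * suu - suy * sxu) / det
    b = (suy * sxx - sxy * sxu) / det
    mse = sum((y - a * x - b * u) ** 2 for x, u, y in data) / len(data)
    return a, b, math.sqrt(mse)


def expected_return(theta, model, rng, rollouts=20, steps=20):
    """Monte Carlo estimate of the return under the uncertain learned model."""
    a, b, sigma = model
    total = 0.0
    for _rollout in range(rollouts):
        x = 0.0
        for _step in range(steps):
            # Sample the next state from the model's predictive distribution.
            x = a * x + b * policy(theta, x) + rng.gauss(0.0, sigma)
            total += reward(x)
    return total / rollouts


def optimize_policy(model, theta, rng, iterations=200, step=0.3):
    """Gradient-free random search over policy parameters (a stand-in optimizer)."""
    best, best_value = theta, expected_return(theta, model, rng)
    for _ in range(iterations):
        candidate = [p + rng.gauss(0.0, step) for p in best]
        value = expected_return(candidate, model, rng)
        if value > best_value:
            best, best_value = candidate, value
    return best


def black_box_policy_search(episodes=5, seed=0):
    """Alternate between real episodes, model learning, and policy optimization."""
    rng = random.Random(seed)
    theta = [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]
    data, returns = [], []
    for _ in range(episodes):
        new_data, ret = run_episode(theta, rng)   # 1. act on the real system
        data += new_data
        returns.append(ret)
        model = fit_model(data)                   # 2. learn the dynamics model
        theta = optimize_policy(model, theta, rng)  # 3. optimize on the model
    return theta, returns
```

Calling `black_box_policy_search()` returns the final policy parameters and the list of real-system returns, one per episode. Because the optimizer only ever calls `reward` and `policy`, either can be swapped for a non-differentiable function without touching the rest of the loop, which is the flexibility the abstract argues for.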

Code & Models

Repositories

resibots/blackdrops (official)

Taxonomy

Topics: Reinforcement Learning in Robotics · Machine Learning and Algorithms · Robotic Path Planning Algorithms