SafePILCO: a software tool for safe and data-efficient policy synthesis

Kyriakos Polymenakos; Nikitas Rontsis; Alessandro Abate; Stephen Roberts

arXiv:2008.03273·cs.LG·August 10, 2020

TL;DR

SafePILCO is a Python-based software tool that enhances the PILCO reinforcement learning algorithm to enable safe, data-efficient policy synthesis, making it more accessible for verification and control applications.

Contribution

The paper introduces a modular Python implementation of SafePILCO, which extends PILCO to support safe learning and makes it usable by a wider range of communities.

Findings

1. Provides a practical, safe reinforcement learning tool
2. Supports data-efficient policy search in continuous control
3. Facilitates wider adoption through modular design

Abstract

SafePILCO is a software tool for safe and data-efficient policy search with reinforcement learning. It extends the known PILCO algorithm, originally written in MATLAB, to support safe learning. We provide a Python implementation and leverage existing libraries that allow the codebase to remain short and modular, which is appropriate for wider use by the verification, reinforcement learning, and control communities.

Code & Models

Repositories

nrontsis/PILCO · TensorFlow · Official

Taxonomy

Topics: Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications · Formal Methods in Verification