Learning Robust Controllers Via Probabilistic Model-Based Policy Search

Valentin Charvet; Bjørn Sand Jensen; Roderick Murray-Smith

arXiv:2110.13576·cs.LG·October 27, 2021


Open Access

TL;DR

This paper proposes a method to improve the robustness of controllers learned through probabilistic model-based policy search by regularizing the Gaussian Process dynamics model, leading to better generalization under environmental perturbations.

Contribution

It introduces a regularization technique for Gaussian Process models in model-based RL to enhance controller robustness against small environment changes.

Findings

01. Regularized models produce more robust controllers.

02. Empirical improvements are shown in a simulation benchmark.

03. Generalization under environmental perturbations improves.

Abstract

Model-based Reinforcement Learning estimates the true environment through a world model in order to approximate the optimal policy. This family of algorithms usually has better sample efficiency than its model-free counterparts. We investigate whether controllers learned in this way are robust and able to generalize under small perturbations of the environment. Our work is inspired by the PILCO algorithm, a method for probabilistic policy search. We show that enforcing a lower bound on the likelihood noise in the Gaussian Process dynamics model regularizes the policy updates and yields more robust controllers. We demonstrate the empirical benefits of our method on a simulation benchmark.
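The idea of a lower bound on the likelihood noise can be illustrated with a small Gaussian Process regression. The sketch below is not the authors' implementation and does not reproduce PILCO: it assumes a one-dimensional toy dataset, a squared-exponential kernel with fixed lengthscale and signal variance, and a grid search over the noise variance in place of gradient-based hyperparameter optimization. All names (`rbf_kernel`, `fit_noise_var`, `noise_floor`, and so on) are hypothetical and chosen for illustration only.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0, signal_var=1.0):
    """Squared-exponential kernel between 1-D input arrays a and b."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return signal_var * np.exp(-0.5 * sq_dist / lengthscale**2)

def log_marginal_likelihood(x, y, noise_var):
    """GP log marginal likelihood for a given likelihood noise variance."""
    K = rbf_kernel(x, x) + noise_var * np.eye(len(x))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return (-0.5 * y @ alpha
            - np.log(np.diag(L)).sum()
            - 0.5 * len(x) * np.log(2 * np.pi))

def fit_noise_var(x, y, noise_floor=0.0):
    """Grid search over the noise variance, clamped from below by noise_floor.

    Assumption: a grid search stands in for the usual gradient-based
    hyperparameter optimization; only the noise variance is fitted.
    """
    candidates = np.maximum(np.logspace(-4, 0, 41), noise_floor)
    scores = [log_marginal_likelihood(x, y, v) for v in candidates]
    return float(candidates[int(np.argmax(scores))])

def predict(x, y, x_star, noise_var):
    """Predictive mean and variance (including likelihood noise) at x_star."""
    K = rbf_kernel(x, x) + noise_var * np.eye(len(x))
    K_s = rbf_kernel(x, x_star)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    v = np.linalg.solve(L, K_s)
    mean = K_s.T @ alpha
    var = rbf_kernel(x_star, x_star).diagonal() - (v**2).sum(axis=0) + noise_var
    return mean, var

# Toy data: a nearly noise-free sine, so an unconstrained fit may pick
# a very small noise variance.
rng = np.random.default_rng(0)
x = np.linspace(-3.0, 3.0, 20)
y = np.sin(x) + 0.01 * rng.standard_normal(x.shape)
x_star = np.linspace(-4.0, 4.0, 9)

noise_free_fit = fit_noise_var(x, y)                   # no lower bound
noise_reg_fit = fit_noise_var(x, y, noise_floor=0.05)  # hypothetical floor value

mean_reg, var_reg = predict(x, y, x_star, noise_reg_fit)
print("unconstrained noise variance:", noise_free_fit)
print("lower-bounded noise variance:", noise_reg_fit)
```

With the floor in place, the fitted noise variance can never drop below `noise_floor`, so the predictive variance of the model stays bounded away from zero even where the data look almost deterministic. The floor value of 0.05 is arbitrary here; how to set it for a given dynamics model is a question for the paper itself.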


Taxonomy

Topics: Gaussian Processes and Bayesian Inference · Reinforcement Learning in Robotics · Advanced Control Systems Optimization

Methods: Gaussian Process