Learning Robust Controllers Via Probabilistic Model-Based Policy Search
Valentin Charvet, Bjørn Sand Jensen, Roderick Murray-Smith

TL;DR
This paper proposes a method to improve the robustness of controllers learned through probabilistic model-based policy search by regularizing the Gaussian Process dynamics model, leading to better generalization under environmental perturbations.
Contribution
It introduces a regularization technique for Gaussian Process models in model-based RL to enhance controller robustness against small environment changes.
Findings
Regularized models produce more robust controllers.
Empirical improvements shown in simulation benchmarks.
Enhanced generalization under environmental perturbations.
Abstract
Model-based Reinforcement Learning estimates the true environment through a world model in order to approximate the optimal policy. This family of algorithms usually benefits from better sample efficiency than its model-free counterparts. We investigate whether controllers learned in such a way are robust and able to generalize under small perturbations of the environment. Our work is inspired by the PILCO algorithm, a method for probabilistic policy search. We show that enforcing a lower bound on the likelihood noise in the Gaussian Process dynamics model regularizes the policy updates and yields more robust controllers. We demonstrate the empirical benefits of our method in a simulation benchmark.
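To make the idea of a lower bound on the likelihood noise concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation: the abstract does not say how the bound is enforced, so the softplus-plus-floor parameterization, the RBF kernel, and all names (`bounded_noise_variance`, `noise_floor`, `raw_noise`, `gp_predict`) are assumptions made for illustration only.

```python
import numpy as np

def softplus(x):
    # Numerically stable log(1 + exp(x)); always non-negative.
    return np.logaddexp(0.0, x)

def bounded_noise_variance(raw_noise, noise_floor):
    # Hypothetical parameterization: an unconstrained parameter is mapped
    # to a noise variance that can never drop below noise_floor.
    return noise_floor + softplus(raw_noise)

def rbf_kernel(A, B, lengthscale=1.0, signal_var=1.0):
    # Squared-exponential kernel between the rows of A and the rows of B.
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return signal_var * np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)

def gp_predict(X, y, X_star, raw_noise, noise_floor):
    # Standard GP regression with a Gaussian likelihood whose noise
    # variance is floored (assumed form of the regularization).
    noise_var = bounded_noise_variance(raw_noise, noise_floor)
    K = rbf_kernel(X, X) + noise_var * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    K_s = rbf_kernel(X, X_star)
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    latent_var = rbf_kernel(X_star, X_star).diagonal() - np.sum(v**2, axis=0)
    # Predictive variance of a noisy observation: latent variance plus noise.
    return mean, latent_var + noise_var
```

Under this assumed parameterization, the predictive variance of the one-step model stays at or above `noise_floor` even at the training inputs, however small the optimizer tries to make the noise. In a PILCO-style method, where predictive uncertainty is propagated through the rollouts used for policy updates, this is one plausible way such a floor could act as a regularizer.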
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Reinforcement Learning in Robotics · Advanced Control Systems Optimization
Methods: Gaussian Process
