Inferring Smooth Control: Monte Carlo Posterior Policy Iteration with Gaussian Processes

Joe Watson; Jan Peters

arXiv:2210.03512 · cs.LG · October 10, 2022

TL;DR

This paper introduces a novel Monte Carlo posterior policy iteration method using Gaussian process priors to achieve smooth control in high-dimensional robot tasks, enhancing sample efficiency and control smoothness.

Contribution

It proposes a new inference-based control approach with Gaussian process priors and an optimized likelihood temperature for smoother, more effective robot control.
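The role a likelihood temperature plays in sample-based inference can be illustrated with a minimal sketch. This is not the paper's temperature optimization, which this summary does not describe; it only shows the generic exponentiated-cost weighting common in inference-based control. The names `posterior_weights`, `effective_sample_size`, the scale `alpha`, and the cost values are all hypothetical, chosen for illustration.

```python
import numpy as np

def posterior_weights(costs, alpha):
    """Normalized importance weights proportional to exp(-alpha * cost).

    `alpha` is an assumed inverse-temperature-style scale: larger values
    concentrate the weight on the lowest-cost samples.
    """
    logits = -alpha * np.asarray(costs, dtype=float)
    logits -= logits.max()  # subtract the max for numerical stability
    w = np.exp(logits)
    return w / w.sum()

def effective_sample_size(weights):
    """Standard effective sample size estimate, 1 / sum(w_i^2)."""
    return 1.0 / float(np.sum(weights ** 2))

# Hypothetical episode costs for five sampled action sequences.
costs = np.array([1.0, 1.2, 2.0, 3.5, 5.0])

w_soft = posterior_weights(costs, alpha=0.1)    # near-uniform weights
w_sharp = posterior_weights(costs, alpha=10.0)  # dominated by the best samples

ess_soft = effective_sample_size(w_soft)
ess_sharp = effective_sample_size(w_sharp)
```

A scale that is too large collapses the update onto one or two samples, while one that is too small barely moves the policy, which is why the choice of temperature matters for sample efficiency.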

Findings

01 Achieves smooth control with high sample efficiency

02 Matches prior heuristic methods in performance

03 Demonstrates effectiveness on high-dimensional robot tasks

Abstract

Monte Carlo methods have become increasingly relevant for control of non-differentiable systems, approximate dynamics models and learning from data. These methods scale to high-dimensional spaces and are effective at the non-convex optimizations often seen in robot learning. We look at sample-based methods from the perspective of inference-based control, specifically posterior policy iteration. From this perspective, we highlight how Gaussian noise priors produce rough control actions that are unsuitable for physical robot deployment. Considering smoother Gaussian process priors, as used in episodic reinforcement learning and motion planning, we demonstrate how smoother model predictive control can be achieved using online sequential inference. This inference is realized through an efficient factorization of the action distribution and a novel means of optimizing the likelihood…
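The contrast the abstract draws between Gaussian noise priors and Gaussian process priors can be made concrete with a small sketch. It samples open-loop action sequences from a white-noise prior and from a GP prior, then compares their mean squared step-to-step change. The squared-exponential kernel, its lengthscale, the horizon of 50 steps, and the `roughness` measure are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def se_kernel(t, lengthscale=0.2, variance=1.0):
    """Squared-exponential covariance over time points t (assumed kernel)."""
    d = t[:, None] - t[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

def sample_action_sequences(cov, n_samples, rng):
    """Draw zero-mean Gaussian action sequences with covariance `cov`."""
    # A small jitter keeps the Cholesky factorization numerically stable.
    chol = np.linalg.cholesky(cov + 1e-6 * np.eye(cov.shape[0]))
    return rng.standard_normal((n_samples, cov.shape[0])) @ chol.T

def roughness(actions):
    """Mean squared difference between consecutive actions."""
    return float(np.mean(np.diff(actions, axis=1) ** 2))

rng = np.random.default_rng(0)
horizon = 50
t = np.linspace(0.0, 1.0, horizon)

# White-noise prior: every time step is independent.
white = sample_action_sequences(np.eye(horizon), 256, rng)
# GP prior: nearby time steps are strongly correlated.
smooth = sample_action_sequences(se_kernel(t), 256, rng)

rough_white = roughness(white)
rough_gp = roughness(smooth)
print(f"white noise: {rough_white:.3f}, GP prior: {rough_gp:.3f}")
```

Under these assumptions the GP samples change far less between consecutive time steps than the white-noise samples, which is the property that makes them better suited to physical actuators.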

Code & Models

Repositories

joemwatson/monte-carlo-posterior-policy-iteration (Official)

Taxonomy

Topics: Gaussian Processes and Bayesian Inference · Reinforcement Learning in Robotics · Advanced Multi-Objective Optimization Algorithms

Methods: Gaussian Process