GPU-Accelerated Policy Optimization via Batch Automatic Differentiation of Gaussian Processes for Real-World Control

Abdolreza Taheri; Joni Pajarinen; Reza Ghabcheloo

arXiv:2202.13638·cs.RO·March 1, 2022

Open Access

TL;DR

This paper introduces a GPU-accelerated policy optimization method using batch automatic differentiation of Gaussian processes, enabling scalable and efficient control policy training for real-world robotics applications.

Contribution

It presents a novel GPU-based approach that leverages batch sampling and automatic differentiation to significantly speed up Gaussian process policy optimization.

Findings

01. Achieved substantial speedup over exact methods.

02. Demonstrated scalability to larger policies and longer horizons.

03. Handled thousands of trajectories with minimal speed reduction.

Abstract

The ability of Gaussian processes (GPs) to predict the behavior of dynamical systems as a more sample-efficient alternative to parametric models seems promising for real-world robotics research. However, the computational complexity of GPs has made policy search a highly time- and memory-consuming process that has not been able to scale to larger problems. In this work, we develop a policy optimization method by leveraging fast predictive sampling methods to process batches of trajectories in every forward pass, and compute gradient updates over policy parameters by automatic differentiation of Monte Carlo evaluations, all on GPU. We demonstrate the effectiveness of our approach in training policies on a set of reference-tracking control experiments with a heavy-duty machine. Benchmark results show a significant speedup over exact methods and showcase the scalability of our method to…
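The core loop described in the abstract (roll out a batch of sampled trajectories, average the cost, differentiate that Monte Carlo estimate with respect to the policy parameters) can be illustrated with a small toy. This is only a sketch under loud assumptions and not the authors' implementation: it runs on CPU in pure Python, `sample_next_state` is a hypothetical stand-in for a GP predictive sample (a fixed mean function plus `std * eps`), the policy is a single hypothetical proportional gain, and gradients come from a hand-rolled forward-mode `Dual` class rather than GPU reverse-mode autodiff. Noise draws are fixed up front so the averaged cost is a deterministic function of the gain.

```python
import random


class Dual:
    """Forward-mode AD number: value plus derivative w.r.t. the policy gain."""

    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot

    @staticmethod
    def lift(x):
        return x if isinstance(x, Dual) else Dual(float(x))

    def __add__(self, o):
        o = Dual.lift(o)
        return Dual(self.val + o.val, self.dot + o.dot)

    __radd__ = __add__

    def __sub__(self, o):
        o = Dual.lift(o)
        return Dual(self.val - o.val, self.dot - o.dot)

    def __rsub__(self, o):
        return Dual.lift(o) - self

    def __mul__(self, o):
        o = Dual.lift(o)
        return Dual(self.val * o.val, self.val * o.dot + self.dot * o.val)

    __rmul__ = __mul__


def sample_next_state(x, u, eps):
    # ASSUMPTION: stand-in for a GP predictive sample, written as
    # mean + std * eps so the sample stays differentiable in the policy.
    mean = 0.9 * x + 0.5 * u
    std = 0.05
    return mean + std * eps


def mc_cost_and_grad(gain, noise, ref=1.0):
    """Average tracking cost over a batch of rollouts and d(cost)/d(gain)."""
    theta = Dual(gain, 1.0)  # seed derivative: d(theta)/d(gain) = 1
    total = Dual(0.0)
    for eps_seq in noise:  # one rollout per noise sequence in the batch
        x = Dual(0.0)
        for eps in eps_seq:
            u = theta * (ref - x)  # hypothetical proportional policy
            x = sample_next_state(x, u, eps)
            err = x - ref
            total = total + err * err  # squared reference-tracking error
    n = len(noise)
    return total.val / n, total.dot / n


def optimize(steps=100, lr=0.005, batch=32, horizon=20, seed=0):
    rng = random.Random(seed)
    # Fixed noise draws shared by every gradient step.
    noise = [[rng.gauss(0.0, 1.0) for _ in range(horizon)] for _ in range(batch)]
    gain, history = 0.0, []
    for _ in range(steps):
        cost, grad = mc_cost_and_grad(gain, noise)
        history.append(cost)
        gain -= lr * grad  # plain gradient descent on the Monte Carlo cost
    return gain, history


if __name__ == "__main__":
    gain, history = optimize()
    print(f"gain={gain:.3f} first cost={history[0]:.3f} last cost={history[-1]:.3f}")
```

In a GPU setting the Python loop over rollouts would typically be replaced by a single batched tensor operation per time step, which is where the batching described in the abstract would pay off; the toy keeps the loop explicit only for readability.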

Taxonomy

Topics: Gaussian Processes and Bayesian Inference · Machine Learning and Data Classification · Advanced Control Systems Optimization

Methods: Greedy Policy Search