Variational Policy Search using Sparse Gaussian Process Priors for Learning Multimodal Optimal Actions

Hikaru Sasaki; Takamitsu Matsubara

arXiv:2106.07125 · cs.RO · June 15, 2021

TL;DR

This paper introduces two novel non-parametric policy search algorithms using sparse Gaussian processes to effectively learn multiple optimal actions in complex robotic tasks, overcoming limitations of unimodal policies.

Contribution

It presents multimodal and mode-seeking sparse Gaussian process policy search methods that handle multiple optimal actions, advancing non-parametric reinforcement learning techniques.

Findings

01. Effective in capturing multiple optimal actions in simulations

02. Improved policy flexibility for complex tasks

03. Demonstrated on object manipulation tasks

Abstract

Policy search reinforcement learning has attracted considerable attention as a method for learning robot control policies. In particular, policy search with non-parametric policies such as Gaussian process regression can learn optimal actions from high-dimensional, redundant sensor inputs. However, previous methods implicitly assume that the optimal action is unique for each state. This assumption can severely limit practical applications such as robot manipulation, since it is difficult to design a reward function that yields only one optimal action for a complex task. Previous methods may therefore suffer critical performance deterioration, because typical non-parametric policies are unimodal and cannot capture multiple optimal actions. We propose novel approaches to non-parametric policy search with multiple optimal actions and offer two different algorithms commonly…
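The unimodality problem described in the abstract can be illustrated with a small numerical sketch. This is not the authors' algorithm: it assumes a hypothetical one-dimensional, state-free task whose reward has two equally good actions (a = -1 and a = +1), and it fits a single Gaussian to reward-weighted actions, as a generic unimodal policy update might. The `reward` function, its `width` parameter, and the action grid are all assumptions made for illustration.

```python
import numpy as np


def reward(a, width=0.1):
    """Hypothetical reward with two equally good actions, a = -1 and a = +1."""
    left = np.exp(-(a + 1.0) ** 2 / (2.0 * width ** 2))
    right = np.exp(-(a - 1.0) ** 2 / (2.0 * width ** 2))
    return left + right


# Candidate actions on a symmetric grid (an assumption; a real policy search
# would sample actions from the current policy instead).
actions = np.linspace(-2.0, 2.0, 401)

# Reward-weighted fit of a single Gaussian: weights proportional to reward.
weights = reward(actions)
weights = weights / weights.sum()
mean = float(np.sum(weights * actions))
std = float(np.sqrt(np.sum(weights * (actions - mean) ** 2)))

# Compare the reward at the fitted mean with the reward at one true optimum.
reward_at_mean = float(reward(mean))
reward_at_mode = float(reward(1.0))

print(f"fitted mean: {mean:.3f}, fitted std: {std:.3f}")
print(f"reward at fitted mean: {reward_at_mean:.3e}")
print(f"reward at a = +1:      {reward_at_mode:.3f}")
```

In this toy setting the fitted mean falls between the two optima, where the reward is close to zero, and the fitted standard deviation is inflated to cover both modes. A policy that can represent several modes, or that commits to one of them, avoids this averaging effect, which is the motivation the abstract gives for the multimodal and mode-seeking approaches.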

Taxonomy

Topics: Gaussian Processes and Bayesian Inference · Reinforcement Learning in Robotics · Advanced Control Systems Optimization