Variational Policy Search using Sparse Gaussian Process Priors for Learning Multimodal Optimal Actions
Hikaru Sasaki, Takamitsu Matsubara

TL;DR
This paper introduces two novel non-parametric policy search algorithms using sparse Gaussian processes to effectively learn multiple optimal actions in complex robotic tasks, overcoming limitations of unimodal policies.
Contribution
It presents multimodal and mode-seeking sparse Gaussian process policy search methods that handle multiple optimal actions, advancing non-parametric reinforcement learning techniques.
Findings
Effective in capturing multiple optimal actions in simulations
Improved policy flexibility for complex tasks
Demonstrated on object manipulation tasks
Abstract
Policy search reinforcement learning has drawn much attention as a method for learning robot control policies. In particular, policy search with non-parametric policies such as Gaussian process regression can learn optimal actions from high-dimensional and redundant sensor inputs. However, previous methods implicitly assume that the optimal action is unique for each state. This assumption can severely limit practical applications such as robot manipulation, since designing a reward function that yields only one optimal action for a complex task is difficult. Previous methods may suffer critical performance deterioration because typical non-parametric policies are unimodal and therefore cannot capture multiple optimal actions. We propose novel approaches in non-parametric policy searches with multiple optimal actions and offer two different algorithms commonly…
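The unimodality problem described in the abstract can be illustrated with a small toy example. The sketch below is not the paper's algorithm: the reward function, the sample distribution, and the sign-based split are all hypothetical choices made only to show why a single-Gaussian fit averages two equally good actions into a poor one.

```python
import math
import random
import statistics


def reward(action):
    """Hypothetical reward with two equally good actions, at -1 and +1."""
    width = 0.1
    left = math.exp(-((action + 1.0) ** 2) / (2 * width ** 2))
    right = math.exp(-((action - 1.0) ** 2) / (2 * width ** 2))
    return max(left, right)


rng = random.Random(0)

# High-reward action samples for a single state: half near -1, half near +1.
samples = [rng.gauss(-1.0, 0.05) for _ in range(200)]
samples += [rng.gauss(1.0, 0.05) for _ in range(200)]

# Unimodal fit: one Gaussian, whose mean lands between the two modes.
unimodal_mean = statistics.fmean(samples)

# Crude two-mode alternative (assumption: the modes are separated by sign).
negative_mode = statistics.fmean([a for a in samples if a < 0])
positive_mode = statistics.fmean([a for a in samples if a >= 0])

print("unimodal mean action:", unimodal_mean, "reward:", reward(unimodal_mean))
print("negative mode action:", negative_mode, "reward:", reward(negative_mode))
print("positive mode action:", positive_mode, "reward:", reward(positive_mode))
```

In this toy setting the averaged action sits near 0, where the reward is close to zero, while either mode alone achieves a reward near 1. Representing both modes, or committing to one of them, avoids the averaging failure.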
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Reinforcement Learning in Robotics · Advanced Control Systems Optimization
