Fast Model-based Policy Search for Universal Policy Networks

Buddhika Laknath Semage, Thommen George Karimpanal, Santu Rana, and Svetha Venkatesh

arXiv:2202.05843·cs.LG·February 15, 2022

Open Access

TL;DR

This paper introduces a Gaussian Process prior combined with Bayesian Optimization to efficiently select the best policy from a universal policy network for new environments, improving adaptation in reinforcement learning.

Contribution

It presents a novel method that integrates a GP prior with Bayesian Optimization to enhance policy selection from universal policy networks in unseen environments.

Findings

01. Outperforms baseline methods in continuous control tasks

02. Efficiently identifies suitable policies for new environments

03. Applicable to both continuous and discrete control scenarios

Abstract

Adapting an agent's behaviour to new environments has been one of the primary focus areas of physics-based reinforcement learning. Although recent approaches such as universal policy networks partially address this issue by storing multiple policies trained in simulation on a wide range of dynamic/latent factors, efficiently identifying the most appropriate policy for a given environment remains a challenge. In this work, we propose a Gaussian Process-based prior, learned in simulation, that captures the likely performance of a policy when transferred to a previously unseen environment. We integrate this prior with a Bayesian Optimisation-based policy search process to improve the efficiency of identifying the most appropriate policy from the universal policy network. We empirically evaluate our approach in a range of continuous and discrete control environments, and show…
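The search loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the universal policy network is replaced by a hypothetical one-dimensional latent factor z, the "prior learned in simulation" is a hand-written function (simulated_prior), the target environment is a made-up function (target_return), and the upper-confidence-bound acquisition rule is an assumption, since the abstract does not name one. All function names and constants are illustrative.

```python
import numpy as np

SIGNAL_VAR = 0.01  # assumed prior variance of returns (toy scale)
LENGTH = 0.2       # assumed kernel length-scale over the latent factor
NOISE = 1e-6       # small observation noise, also acts as jitter

def rbf(a, b):
    """Squared-exponential kernel between two 1-D arrays of latent values."""
    d = a[:, None] - b[None, :]
    return SIGNAL_VAR * np.exp(-0.5 * (d / LENGTH) ** 2)

def gp_posterior(x_obs, y_obs, x_query, prior_mean):
    """GP posterior mean and std at x_query, with prior_mean as mean function."""
    if len(x_obs) == 0:
        # No target-environment data yet: fall back to the prior alone.
        return prior_mean(x_query), np.full(len(x_query), np.sqrt(SIGNAL_VAR))
    K = rbf(x_obs, x_obs) + NOISE * np.eye(len(x_obs))
    Ks = rbf(x_query, x_obs)
    mu = prior_mean(x_query) + Ks @ np.linalg.solve(K, y_obs - prior_mean(x_obs))
    var = SIGNAL_VAR - np.sum(Ks * np.linalg.solve(K, Ks.T).T, axis=1)
    return mu, np.sqrt(np.clip(var, 1e-12, None))

def bo_policy_search(evaluate, prior_mean, candidates, n_iters=10, beta=2.0):
    """Pick latent values by UCB, evaluate them, return the best one seen."""
    x_obs, y_obs = np.empty(0), np.empty(0)
    for _ in range(n_iters):
        mu, sd = gp_posterior(x_obs, y_obs, candidates, prior_mean)
        z_next = candidates[np.argmax(mu + beta * sd)]  # UCB acquisition (assumed)
        x_obs = np.append(x_obs, z_next)
        y_obs = np.append(y_obs, evaluate(z_next))      # one rollout in the target env
    best = np.argmax(y_obs)
    return x_obs[best], y_obs[best]

def simulated_prior(z):
    """Hypothetical prior: simulation suggests the best latent value is 0.5."""
    return -(z - 0.5) ** 2

def target_return(z):
    """Hypothetical target environment whose true best latent value is 0.62."""
    return -(z - 0.62) ** 2

candidates = np.linspace(0.0, 1.0, 101)
best_z, best_ret = bo_policy_search(target_return, simulated_prior, candidates)
print(best_z, best_ret)
```

With no observations, the first query is simply the candidate the prior ranks highest, so the search starts from the simulation's best guess and then corrects it with data from the target environment. In this sketch each call to evaluate stands in for running the policy conditioned on z in the new environment.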

Taxonomy

Topics: Gaussian Processes and Bayesian Inference · Reinforcement Learning in Robotics · Machine Learning and Data Classification