Efficient Model-Free Reinforcement Learning Using Gaussian Process

Ying Fan; Letian Chen; Yizhou Wang

arXiv:1812.04359 · cs.LG · December 12, 2018 · 5 cites

TL;DR

This paper introduces GPPSTD, a model-free reinforcement learning algorithm leveraging Gaussian processes for efficient exploration, combining demonstration data to reduce uncertainty and improve learning in continuous spaces.

Contribution

The paper proposes GPPSTD, a novel Gaussian process-based posterior sampling method that integrates demonstration data to enhance exploration efficiency in model-free RL.

Findings

01. GPPSTD outperforms traditional methods in continuous state spaces.

02. Demonstration data reduces uncertainty and accelerates learning.

03. Theoretical analysis supports improved exploration efficiency.
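
The claim that demonstration data reduces uncertainty can be illustrated with a standard property of Gaussian process regression: with fixed kernel hyperparameters, the posterior variance depends only on where observations were collected, and adding observations never increases it. The sketch below is not the paper's GPPSTD algorithm; the RBF kernel, the noise level, and the state values in `x_agent` and `x_demo` are all assumptions chosen for illustration.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between two 1-D arrays of states."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)

def posterior_variance(x_train, x_query, noise=1e-2):
    """GP posterior variance at x_query given observations at x_train."""
    k_tt = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_qt = rbf_kernel(x_query, x_train)
    solved = np.linalg.solve(k_tt, k_qt.T)        # shape (n_train, n_query)
    prior_var = np.ones(len(x_query))             # k(x, x) = 1 for this kernel
    return prior_var - np.sum(k_qt * solved.T, axis=1)

# Hypothetical 1-D state space: the agent has only visited states near 1.0,
# while the demonstrations cover states it has not reached yet.
x_query = np.linspace(0.0, 10.0, 50)
x_agent = np.array([0.5, 1.0, 1.5])
x_demo = np.array([4.0, 6.0, 8.0])

var_agent_only = posterior_variance(x_agent, x_query)
var_with_demo = posterior_variance(np.concatenate([x_agent, x_demo]), x_query)

print("mean variance, agent data only:", var_agent_only.mean())
print("mean variance, with demonstrations:", var_with_demo.mean())
```

Under these assumptions the variance drops most around the demonstrated states, which is where a posterior sampling explorer would otherwise spend samples.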

Abstract

Efficient reinforcement learning usually takes advantage of demonstrations or a good exploration strategy. By applying posterior sampling to model-free RL under a Gaussian process (GP) hypothesis, we propose the Gaussian Process Posterior Sampling Reinforcement Learning (GPPSTD) algorithm for continuous state spaces, with theoretical justification and empirical results. We also provide theoretical and empirical results showing that various demonstrations can lower expected uncertainty and benefit posterior sampling exploration. In this way, we combine demonstration and exploration to achieve more efficient reinforcement learning.
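
As a rough illustration of posterior sampling under a GP hypothesis, the sketch below draws one sample of action values from a GP posterior and acts greedily with respect to that sample (Thompson-style exploration). It is a minimal sketch under stated assumptions, not the GPPSTD algorithm from the paper: the RBF kernel, the (state, action) input encoding, the discrete action set, and the helper names `sample_q_values` and `select_action` are all hypothetical.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between rows of a and rows of b."""
    sq_dist = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale ** 2)

def sample_q_values(x_train, y_train, x_query, rng, noise=1e-2):
    """Draw one joint sample of Q at x_query from the GP posterior."""
    k_tt = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_qt = rbf_kernel(x_query, x_train)
    k_qq = rbf_kernel(x_query, x_query)
    mean = k_qt @ np.linalg.solve(k_tt, y_train)
    cov = k_qq - k_qt @ np.linalg.solve(k_tt, k_qt.T)
    # Symmetrize and add jitter so the covariance is numerically valid.
    cov = 0.5 * (cov + cov.T) + 1e-8 * np.eye(len(x_query))
    return rng.multivariate_normal(mean, cov)

def select_action(state, actions, x_train, y_train, rng):
    """Act greedily with respect to a single posterior sample of Q."""
    x_query = np.array([[state, a] for a in actions])
    q_sample = sample_q_values(x_train, y_train, x_query, rng)
    return actions[int(np.argmax(q_sample))]

# Hypothetical data: each row is (state, action), each target a value estimate.
x_train = np.array([[0.0, -1.0], [0.0, 1.0], [1.0, -1.0], [1.0, 1.0]])
y_train = np.array([0.0, 1.0, 0.2, 0.8])
actions = [-1.0, 0.0, 1.0]

rng = np.random.default_rng(0)
action = select_action(0.5, actions, x_train, y_train, rng)
print("chosen action:", action)
```

Because the action is chosen from a random posterior draw rather than the posterior mean, poorly explored actions with high variance are occasionally selected, and that selection pressure fades as data (from the agent or from demonstrations) shrinks the posterior variance.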

Taxonomy

Topics: Reinforcement Learning in Robotics · Advanced Control Systems Optimization · Gaussian Processes and Bayesian Inference

Methods: Gaussian Process