Efficient Model-Free Reinforcement Learning Using Gaussian Process
Ying Fan, Letian Chen, Yizhou Wang

TL;DR
This paper introduces GPPSTD, a model-free reinforcement learning algorithm that applies Gaussian process posterior sampling for efficient exploration in continuous state spaces and uses demonstration data to reduce uncertainty and speed up learning.
Contribution
The paper proposes GPPSTD, a novel Gaussian process-based posterior sampling method that integrates demonstration data to enhance exploration efficiency in model-free RL.
Findings
GPPSTD outperforms traditional methods in continuous state spaces.
Demonstration data reduces uncertainty and accelerates learning.
Theoretical analysis supports improved exploration efficiency.
Abstract
Efficient reinforcement learning usually relies on demonstrations or a good exploration strategy. By applying posterior sampling to model-free RL under a Gaussian process (GP) hypothesis, we propose the Gaussian Process Posterior Sampling Reinforcement Learning (GPPSTD) algorithm for continuous state spaces, with theoretical justification and empirical results. We also provide theoretical and empirical results showing that various demonstrations can lower expected uncertainty and benefit posterior sampling exploration. In this way, we combine demonstration and exploration to achieve more efficient reinforcement learning.
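The two ideas in the abstract, sampling a value function from a GP posterior and using demonstrations to shrink posterior uncertainty, can be illustrated with a small sketch. This is not the authors' GPPSTD algorithm: it is plain GP regression on a toy one-dimensional state space, and the kernel, its hyperparameters, the `true_value` function, and the choice of agent and demonstration states are all assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.2, signal_var=1.0):
    """Squared-exponential kernel between 1-D state arrays (assumed kernel)."""
    diff = a[:, None] - b[None, :]
    return signal_var * np.exp(-0.5 * (diff / length_scale) ** 2)

def gp_posterior(x_train, y_train, x_query, noise_var=1e-2):
    """Posterior mean and covariance of a zero-mean GP at x_query."""
    k_tt = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    k_qt = rbf_kernel(x_query, x_train)
    k_qq = rbf_kernel(x_query, x_query)
    chol = np.linalg.cholesky(k_tt)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    v = np.linalg.solve(chol, k_qt.T)
    return k_qt @ alpha, k_qq - v.T @ v

def sample_from_posterior(mean, cov, rng):
    """Draw one function sample; eigenvalues are clipped for numerical safety."""
    eigvals, eigvecs = np.linalg.eigh(cov)
    root = eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))
    return mean + root @ rng.standard_normal(len(mean))

def true_value(s):
    """Hypothetical ground-truth value function, used only to make targets."""
    return np.sin(3.0 * s)

rng = np.random.default_rng(0)
states = np.linspace(0.0, 1.0, 50)       # query grid over the state space

own_x = np.array([0.05, 0.10, 0.20])     # states the agent visited itself
demo_x = np.array([0.50, 0.70, 0.90])    # states covered by demonstrations
both_x = np.concatenate([own_x, demo_x])

own_y = true_value(own_x) + 0.1 * rng.standard_normal(len(own_x))
both_y = np.concatenate(
    [own_y, true_value(demo_x) + 0.1 * rng.standard_normal(len(demo_x))]
)

mean_own, cov_own = gp_posterior(own_x, own_y, states)
mean_both, cov_both = gp_posterior(both_x, both_y, states)

# Posterior sampling step: act as if one sampled value function were true.
sampled_values = sample_from_posterior(mean_both, cov_both, rng)

avg_var_own = float(np.mean(np.diag(cov_own)))
avg_var_demo = float(np.mean(np.diag(cov_both)))
print("average posterior variance, own data only:", avg_var_own)
print("average posterior variance, with demonstrations:", avg_var_demo)
```

With fixed kernel hyperparameters, conditioning a GP on extra observations cannot increase the posterior variance at any query point, so the second printed number is smaller than the first. In this toy setting that is the sense in which demonstrations lower uncertainty before posterior sampling drives exploration.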
Taxonomy
Topics: Reinforcement Learning in Robotics · Advanced Control Systems Optimization · Gaussian Processes and Bayesian Inference
Methods: Gaussian Process
