PC-MLP: Model-based Reinforcement Learning with Policy Cover Guided Exploration

Yuda Song; Wen Sun

arXiv:2107.07410 · cs.LG · July 16, 2021 · 5 citations

TL;DR

This paper introduces PC-MLP, a model-based RL algorithm that uses a policy cover to guide exploration. It guarantees polynomial sample complexity and performs well on both exploration-heavy and standard dense-reward control tasks, as well as in reward-free exploration.

Contribution

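The paper presents a computationally and statistically efficient model-based RL algorithm with polynomial sample complexity guarantees for both Kernelized Nonlinear Regulators (KNR) and linear MDPs. The algorithm needs only a planning oracle, and it succeeds on exploration-heavy control tasks where existing empirical model-based approaches fail.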

Findings

01 Handles exploration-challenging control tasks on which existing empirical model-based RL approaches fail

02 Retains strong performance on common dense-reward control benchmarks

03 Demonstrates efficient reward-free exploration

Abstract

Model-based Reinforcement Learning (RL) is a popular learning paradigm due to its potential sample efficiency compared to model-free RL. However, existing empirical model-based RL approaches lack the ability to explore. This work studies a computationally and statistically efficient model-based algorithm for both Kernelized Nonlinear Regulators (KNR) and linear Markov Decision Processes (MDPs). For both models, our algorithm guarantees polynomial sample complexity and only uses access to a planning oracle. Experimentally, we first demonstrate the flexibility and efficacy of our algorithm on a set of exploration challenging control tasks where existing empirical model-based RL approaches completely fail. We then show that our approach retains excellent performance even in common dense reward control benchmarks that do not require heavy exploration. Finally, we demonstrate that our method…
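
The abstract does not spell out how a policy cover steers exploration. As a rough illustration only, the sketch below assumes a feature map over state-action pairs and an elliptical bonus computed from the features the cover has visited. This is a common construction in the linear MDP literature, not something the text above confirms. The names `cover_covariance`, `elliptical_bonus`, and the `reg` parameter are hypothetical and are not taken from the authors' repository.

```python
import numpy as np


def cover_covariance(features, reg=1.0):
    """Regularized second-moment matrix of features visited by the cover.

    `features` is an (n, d) array of hypothetical feature vectors phi(s, a);
    `reg` is an assumed ridge term that keeps the matrix invertible.
    """
    features = np.asarray(features, dtype=float)
    d = features.shape[1]
    return features.T @ features + reg * np.eye(d)


def elliptical_bonus(phi, sigma):
    """Return phi^T sigma^{-1} phi, which is large in rarely visited directions."""
    phi = np.asarray(phi, dtype=float)
    return float(phi @ np.linalg.solve(sigma, phi))


# Toy data: the cover has only ever moved along the first feature direction.
visited = np.array([[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]])
sigma = cover_covariance(visited, reg=1.0)

seen = elliptical_bonus([1.0, 0.0], sigma)    # direction already covered
unseen = elliptical_bonus([0.0, 1.0], sigma)  # direction never covered
print(seen, unseen)
```

In this toy setup the unvisited direction receives the larger bonus, so a planner that maximizes reward plus bonus would be drawn toward it. How the paper actually defines and uses its bonus should be checked against the full text.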

Code & Models

Repositories

yudasong/PCMLP (official)

Videos

PC-MLP: Model-based Reinforcement Learning with Policy Cover Guided Exploration · SlidesLive

Taxonomy

Topics: Reinforcement Learning in Robotics · Data Stream Mining Techniques