Reinforcement Learning with Probabilistically Complete Exploration

Philippe Morere; Gilad Francis; Tom Blau; Fabio Ramos

arXiv:2001.06940·cs.LG·January 22, 2020·5 cites

Open Access

TL;DR

This paper introduces R3L, a reinforcement learning method that uses probabilistically complete planning algorithms such as RRT to explore more efficiently, converge faster, and reduce sample complexity in sparse reward environments.

Contribution

The paper presents R3L, an approach that combines planning algorithms with RL to improve exploration, and provides theoretical guarantees that its exploration finds successful solutions.

Findings

01. R3L outperforms classic exploration methods in sample efficiency.

02. R3L achieves faster convergence and better asymptotic performance.

03. Theoretical bounds on exploration success and sampling complexity are provided.

Abstract

Balancing exploration and exploitation remains a key challenge in reinforcement learning (RL). State-of-the-art RL algorithms suffer from high sample complexity, particularly in the sparse reward case, where they can do no better than to explore in all directions until the first positive rewards are found. To mitigate this, we propose Rapidly Randomly-exploring Reinforcement Learning (R3L). We formulate exploration as a search problem and leverage widely-used planning algorithms such as Rapidly-exploring Random Tree (RRT) to find initial solutions. These solutions are used as demonstrations to initialize a policy, then refined by a generic RL algorithm, leading to faster and more stable convergence. We provide theoretical guarantees of R3L exploration finding successful solutions, as well as bounds for its sampling complexity. We experimentally demonstrate the method outperforms classic…
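
The pipeline described in the abstract (search for a first solution with an RRT-style planner, then reuse that solution as demonstrations) can be illustrated with a minimal sketch. This is not the authors' implementation: the 2D unit-square state space, the straight-line steering step, and the names `rrt_explore` and `path_to_demonstrations` are all assumptions made for illustration, and the policy initialization and RL refinement stages are left out.

```python
import math
import random


def rrt_explore(start, goal, goal_radius=0.1, step=0.05, max_iters=2000, seed=0):
    """Grow a tree from `start` until a node lands within `goal_radius` of `goal`.

    Assumed setting (not from the paper): states are points in the unit
    square, there are no obstacles, and the agent can move in a straight
    line by at most `step` per transition.
    Returns the list of states from start to goal region, or None.
    """
    rng = random.Random(seed)
    nodes = [start]      # tree vertices
    parent = {0: None}   # index of each vertex's parent

    for _ in range(max_iters):
        # Sample a random state; expanding the nearest node toward it
        # biases growth toward unexplored regions.
        target = (rng.random(), rng.random())
        i_near = min(range(len(nodes)), key=lambda i: math.dist(nodes[i], target))
        near = nodes[i_near]
        d = math.dist(near, target)
        if d == 0.0:
            continue

        # Steer from the nearest node toward the sample, capped at `step`.
        scale = min(step, d) / d
        new = (near[0] + scale * (target[0] - near[0]),
               near[1] + scale * (target[1] - near[1]))
        nodes.append(new)
        parent[len(nodes) - 1] = i_near

        # Sparse success signal: stop at the first node inside the goal region.
        if math.dist(new, goal) <= goal_radius:
            path, i = [], len(nodes) - 1
            while i is not None:
                path.append(nodes[i])
                i = parent[i]
            return path[::-1]
    return None


def path_to_demonstrations(path):
    """Turn a state path into (state, action) pairs; the action is the displacement."""
    return [(s, (n[0] - s[0], n[1] - s[1])) for s, n in zip(path, path[1:])]


if __name__ == "__main__":
    path = rrt_explore(start=(0.1, 0.1), goal=(0.9, 0.9))
    if path is not None:
        demos = path_to_demonstrations(path)
        print(f"found a path with {len(demos)} transitions")
```

The resulting (state, action) pairs are the kind of data that could seed a policy, for example through supervised learning, before a generic RL algorithm refines it. How R3L performs that initialization is not covered by this sketch.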

Taxonomy

Topics: Reinforcement Learning in Robotics · Robotic Path Planning Algorithms · Data Stream Mining Techniques