Adaptive Multi-Goal Exploration

Jean Tarbouriech; Omar Darwiche Domingues; Pierre Ménard; Matteo Pirotta; Michal Valko; Alessandro Lazaric

arXiv:2111.12045 · cs.LG · February 25, 2022

Open Access · 1 Dataset

TL;DR

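This paper presents AdaGoal, a goal selection scheme for multi-goal exploration in reinforcement learning that targets goals of intermediate difficulty. It comes with a nearly minimax-optimal sample complexity bound in tabular MDPs, a PAC guarantee in linear mixture MDPs, and an application to goal-conditioned deep RL.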

Contribution

It introduces AdaGoal, a goal selection scheme that uses a measure of uncertainty in reaching states to adaptively target goals that are neither too difficult nor too easy. The scheme admits near-optimal exploration guarantees and extends to deep RL.

Findings

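01. Achieves nearly minimax-optimal sample complexity in tabular MDPs: $\tilde{O}(L^{3} S A ϵ^{-2})$ exploration steps for $S$ states and $A$ actions.

02. Provides the first goal-oriented PAC guarantee with linear function approximation, via an instantiation in linear mixture MDPs.

03. Demonstrates effectiveness in goal-conditioned deep reinforcement learning.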

Abstract

We introduce a generic strategy for provably efficient multi-goal exploration. It relies on AdaGoal, a novel goal selection scheme that leverages a measure of uncertainty in reaching states to adaptively target goals that are neither too difficult nor too easy. We show how AdaGoal can be used to tackle the objective of learning an $ϵ$-optimal goal-conditioned policy for the (initially unknown) set of goal states that are reachable within $L$ steps in expectation from a reference state $s_{0}$ in a reward-free Markov decision process. In the tabular case with $S$ states and $A$ actions, our algorithm requires $\tilde{O}(L^{3} S A ϵ^{-2})$ exploration steps, which is nearly minimax optimal. We also readily instantiate AdaGoal in linear mixture Markov decision processes, yielding the first goal-oriented PAC guarantee with linear function approximation. Beyond its strong…
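The idea of targeting goals that are "neither too difficult nor too easy" can be illustrated with a minimal Python sketch. It is an illustration only, not the paper's algorithm. It assumes the agent already maintains, for each candidate goal, an estimated expected number of steps to reach it and a scalar uncertainty score. The function name `select_goal`, the two dictionaries, and the hard threshold on estimated distance are all hypothetical choices made for this example.

```python
from typing import Dict, Hashable, Optional


def select_goal(
    est_distance: Dict[Hashable, float],
    uncertainty: Dict[Hashable, float],
    horizon_L: float,
) -> Optional[Hashable]:
    """Hypothetical selection rule: among goals whose estimated distance
    from the reference state is at most horizon_L, return the goal with
    the highest uncertainty score. Returns None if no goal qualifies."""
    # Discard goals that currently look too far away ("too difficult").
    candidates = [g for g, d in est_distance.items() if d <= horizon_L]
    if not candidates:
        return None
    # Among the remaining goals, low uncertainty means "too easy";
    # prefer the goal the agent is least sure how to reach.
    return max(candidates, key=lambda g: uncertainty[g])


# Toy example with made-up numbers (assumptions, not data from the paper).
est_distance = {"a": 1.0, "b": 4.0, "c": 9.0}
uncertainty = {"a": 0.05, "b": 0.60, "c": 0.90}
chosen = select_goal(est_distance, uncertainty, horizon_L=5.0)
print(chosen)  # prints: b
```

In this toy run, goal `c` has the highest uncertainty but is excluded because its estimated distance exceeds the threshold, and goal `a` is within range but already well understood, so `b` is selected.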

Datasets

misovalko/my-research-papers
dataset · 21 dl

Taxonomy

Topics: Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research · Smart Grid Energy Management