Active Coverage for PAC Reinforcement Learning
Aymen Al-Marjani, Andrea Tirinzoni, Emilie Kaufmann

TL;DR
This paper formalizes active coverage in episodic reinforcement learning, where the learner interacts with the environment to meet given sampling requirements, and pairs an instance-dependent lower bound with a nearly matching algorithm.
Contribution
It formalizes active coverage in episodic MDPs, derives an instance-dependent lower bound, and proposes CovGame, a nearly optimal algorithm adaptable to multiple PAC RL tasks.
Findings
CovGame nearly matches the lower bound on sample complexity.
The reward-free exploration algorithm built on CovGame has a sample complexity below the minimax bound in easy-to-explore MDPs.
The combined approach yields a computationally efficient best-policy identification method.
Abstract
Collecting and leveraging data with good coverage properties plays a crucial role in different aspects of reinforcement learning (RL), including reward-free exploration and offline learning. However, the notion of "good coverage" really depends on the application at hand, as data suitable for one context may not be so for another. In this paper, we formalize the problem of active coverage in episodic Markov decision processes (MDPs), where the goal is to interact with the environment so as to fulfill given sampling requirements. This framework is sufficiently flexible to specify any desired coverage property, making it applicable to any problem that involves online exploration. Our main contribution is an instance-dependent lower bound on the sample complexity of active coverage and a simple game-theoretic algorithm, CovGame, that nearly matches it. We then show that CovGame can be used…
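To make "sampling requirements" concrete, here is a minimal sketch of how a coverage target could be represented and checked. It is not CovGame and not the authors' code. The array layout (stage, state, action), the function names `coverage_satisfied` and `remaining_requirement`, and the toy numbers are all assumptions chosen for illustration.

```python
import numpy as np

def coverage_satisfied(counts, targets):
    """True when every (stage, state, action) triple has been visited
    at least as many times as its target requires."""
    return bool(np.all(counts >= targets))

def remaining_requirement(counts, targets):
    """Number of visits still missing for each triple (never negative)."""
    return np.maximum(targets - counts, 0)

# Hypothetical toy instance: 2 stages, 2 states, 2 actions.
# targets[h, s, a] is the required number of visits to (s, a) at stage h.
targets = np.array([[[3, 0], [1, 1]],
                    [[0, 2], [0, 0]]])
counts = np.zeros_like(targets)

# One hypothetical episode, recorded as (stage, state, action) visits.
episode = [(0, 0, 0), (1, 0, 1)]
for h, s, a in episode:
    counts[h, s, a] += 1

done = coverage_satisfied(counts, targets)
missing = remaining_requirement(counts, targets)
```

In this picture, an active coverage algorithm chooses which policy to play in each episode so that `coverage_satisfied` becomes true after as few episodes as possible. The sketch only covers the bookkeeping, not the policy selection.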
Taxonomy
Topics: Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research · Optimization and Search Problems
