Sample-Efficiency in Multi-Batch Reinforcement Learning: The Need for Dimension-Dependent Adaptivity

Emmeran Johnson; Ciara Pike-Burke; Patrick Rebeschini

arXiv:2310.01616·cs.LG·May 29, 2024

TL;DR

This paper investigates how the level of adaptivity in multi-batch reinforcement learning affects sample-efficiency, revealing that a certain degree of adaptivity, dependent on problem dimension, is necessary for efficiency.

Contribution

The paper establishes dimension-dependent lower bounds on the number of batches needed for sample-efficient reinforcement learning, highlighting the nuanced role of adaptivity.

Findings

01

Lower bounds of $\Omega(\log \log d)$ on the number of batches needed for sample-efficiency

02

Sample-efficiency requires a minimum adaptivity level depending on problem dimension

03

Adaptivity alone does not guarantee sample-efficiency in reinforcement learning

Abstract

We theoretically explore the relationship between sample-efficiency and adaptivity in reinforcement learning. An algorithm is sample-efficient if it uses a number of queries $n$ to the environment that is polynomial in the dimension $d$ of the problem. Adaptivity refers to the frequency at which queries are sent and feedback is processed to update the querying strategy. To investigate this interplay, we employ a learning framework that allows sending queries in $K$ batches, with feedback being processed and queries updated after each batch. This model encompasses the whole adaptivity spectrum, ranging from non-adaptive 'offline' ($K = 1$) to fully adaptive ($K = n$) scenarios, and regimes in between. For the problems of policy evaluation and best-policy identification under $d$-dimensional linear function approximation, we establish $\Omega(\log \log d)$ lower bounds on the number of…
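The batch protocol described in the abstract can be illustrated with a minimal sketch. It is not the paper's construction: the names `run_multi_batch`, `choose_batch`, and `environment` are hypothetical, and splitting the $n$ queries evenly across the $K$ batches is an assumption made only to keep the example short. The point it shows is that every query in a batch is fixed before any feedback from that batch is seen.

```python
from typing import Callable, List, Sequence, Tuple

Query = int
Feedback = float
History = List[Tuple[Query, Feedback]]


def run_multi_batch(
    n: int,
    k: int,
    choose_batch: Callable[[History, int], Sequence[Query]],
    environment: Callable[[Query], Feedback],
) -> History:
    """Send n queries in k batches; feedback is only used between batches."""
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= n")
    history: History = []
    # Assumption: split the n queries as evenly as possible over k batches.
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    for size in sizes:
        # The whole batch is chosen from feedback on earlier batches only.
        batch = list(choose_batch(history, size))
        if len(batch) != size:
            raise ValueError("choose_batch returned the wrong number of queries")
        # Feedback is recorded after every query in the batch has been fixed.
        history.extend((q, environment(q)) for q in batch)
    return history
```

With `k=1` the chooser is called once with an empty history, which corresponds to the non-adaptive 'offline' setting; with `k=n` it is called before every single query, which corresponds to the fully adaptive setting.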

Videos

Sample-Efficiency in Multi-Batch Reinforcement Learning: The Need for Dimension-Dependent Adaptivity · SlidesLive

Taxonomy

Topics: Reinforcement Learning in Robotics · Optimization and Search Problems · Auction Theory and Applications