Sample-Efficiency in Multi-Batch Reinforcement Learning: The Need for Dimension-Dependent Adaptivity
Emmeran Johnson, Ciara Pike-Burke, Patrick Rebeschini

TL;DR
This paper investigates how the level of adaptivity in multi-batch reinforcement learning affects sample-efficiency, revealing that a certain degree of adaptivity, dependent on problem dimension, is necessary for efficiency.
Contribution
The paper establishes dimension-dependent lower bounds on the number of batches needed for sample-efficient reinforcement learning, highlighting the nuanced role of adaptivity.
Findings
Lower bounds of Ω(log log d) on the number of batches required for sample efficiency
Sample-efficiency requires a minimum adaptivity level depending on problem dimension
Adaptivity alone does not guarantee sample-efficiency in reinforcement learning
Abstract
We theoretically explore the relationship between sample-efficiency and adaptivity in reinforcement learning. An algorithm is sample-efficient if it uses a number of queries to the environment that is polynomial in the dimension of the problem. Adaptivity refers to the frequency at which queries are sent and feedback is processed to update the querying strategy. To investigate this interplay, we employ a learning framework that allows sending queries in batches, with feedback being processed and queries updated after each batch. This model encompasses the whole adaptivity spectrum, ranging from non-adaptive 'offline' (K = 1) to fully adaptive (K = n) scenarios, and regimes in between. For the problems of policy evaluation and best-policy identification under d-dimensional linear function approximation, we establish lower bounds on the number of…
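The batch protocol described in the abstract can be illustrated with a minimal sketch. It is not the authors' construction: the names `run_multi_batch`, `choose_batch`, `environment`, and `history_sizes_seen`, the (state, action) query format, and the even split of n queries across K batches are all assumptions made for illustration. The point it shows is that every query in a batch is committed before any feedback from that batch is seen, so K = 1 corresponds to the offline setting and K = n to the fully adaptive one.

```python
from typing import Callable, List, Tuple

Query = Tuple[int, int]       # hypothetical (state, action) pair
Feedback = Tuple[float, int]  # hypothetical (reward, next state)
History = List[Tuple[Query, Feedback]]


def run_multi_batch(
    n: int,
    K: int,
    choose_batch: Callable[[History, int], List[Query]],
    environment: Callable[[Query], Feedback],
) -> History:
    """Send n queries in K batches; feedback is only used between batches."""
    if not 1 <= K <= n:
        raise ValueError("need 1 <= K <= n")
    # Assumption: batches are as equal in size as possible.
    sizes = [n // K + (1 if i < n % K else 0) for i in range(K)]
    history: History = []
    for size in sizes:
        # The whole batch is chosen from feedback of earlier batches only.
        batch = choose_batch(history, size)
        if len(batch) != size:
            raise ValueError("choose_batch returned the wrong number of queries")
        # Feedback becomes available only once the full batch has been sent.
        history.extend((q, environment(q)) for q in batch)
    return history


def history_sizes_seen(n: int, K: int) -> List[int]:
    """Record how much feedback is visible each time a batch is chosen."""
    seen: List[int] = []

    def choose_batch(history: History, size: int) -> List[Query]:
        seen.append(len(history))
        return [(0, 0)] * size  # placeholder queries

    def environment(query: Query) -> Feedback:
        return (0.0, 0)  # placeholder feedback

    run_multi_batch(n, K, choose_batch, environment)
    return seen
```

Calling `history_sizes_seen(6, 1)` shows a single decision made with no feedback (offline), while `history_sizes_seen(6, 6)` shows each query chosen after seeing all earlier feedback (fully adaptive). Intermediate values of K give the regimes in between.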
Taxonomy
Topics: Reinforcement Learning in Robotics · Optimization and Search Problems · Auction Theory and Applications
