Adaptive Discretization in Online Reinforcement Learning
Sean R. Sinclair, Siddhartha Banerjee, Christina Lee Yu

TL;DR
This paper introduces a unified theoretical framework for tree-based hierarchical discretization methods in online reinforcement learning, demonstrating how these algorithms adapt to problem structure and provide guarantees that depend on the problem's inherent complexity.
Contribution
The paper offers the first comprehensive theoretical analysis of hierarchical discretization algorithms in online RL, with guarantees that depend on the problem's structure rather than ambient dimension.
Findings
Guarantees scale with 'zooming dimension' instead of ambient dimension
Algorithms adapt to problem structure automatically
Explicit bounds provided for sample complexity, storage, and computation
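The adaptivity described in the findings above can be illustrated with a minimal Python sketch of a tree-based partition that refines itself where samples concentrate. This is not the paper's algorithm: the names (`Node`, `visit`, `split`), the unit-hypercube domain, the uniform halving of cells, and the splitting threshold (split once a cell's visit count reaches `(1 / side) ** 2`) are all assumptions chosen for illustration, and the sketch omits the value or model estimates that an actual model-free or model-based method would store in each cell.

```python
from dataclasses import dataclass, field
from itertools import product


@dataclass
class Node:
    """A hypercube cell [low, low + side]^d in the partition tree (hypothetical)."""
    low: tuple
    side: float
    visits: int = 0
    children: list = field(default_factory=list)

    def contains(self, x):
        # Closed on both ends, so a boundary point matches the first child checked.
        return all(l <= xi <= l + self.side for l, xi in zip(self.low, x))

    def is_leaf(self):
        return not self.children


def find_leaf(root, x):
    """Descend from the root to the leaf cell containing point x."""
    node = root
    while not node.is_leaf():
        node = next(c for c in node.children if c.contains(x))
    return node


def split(node):
    """Refine a cell into 2^d children by halving every coordinate."""
    half = node.side / 2
    for offsets in product((0.0, half), repeat=len(node.low)):
        low = tuple(l + o for l, o in zip(node.low, offsets))
        node.children.append(Node(low, half))


def visit(root, x):
    """Record one sample at x; refine its cell once it has enough visits.

    Assumed rule: split when visits >= (1 / side)^2, so smaller cells
    need more samples before they are refined again.
    """
    leaf = find_leaf(root, x)
    leaf.visits += 1
    if leaf.visits >= (1.0 / leaf.side) ** 2:
        split(leaf)
    return leaf


def leaves(node):
    """Collect all leaf cells, i.e. the current discretization."""
    if node.is_leaf():
        return [node]
    return [l for c in node.children for l in leaves(c)]


# Usage: 100 samples at one point of the unit square.
root = Node(low=(0.0, 0.0), side=1.0)
for _ in range(100):
    visit(root, (0.1, 0.1))
print(len(leaves(root)))                  # 13
print(min(l.side for l in leaves(root)))  # 0.0625
```

In this toy run the partition becomes fine only around the repeatedly visited point, while regions that are never visited remain coarse. That mirrors, in spirit, why storage and computation can depend on where the algorithm actually needs resolution rather than on a uniform grid over the whole space.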
Abstract
Discretization-based approaches to solving online reinforcement learning problems have been studied extensively in practice on applications ranging from resource allocation to cache management. Two major questions in designing discretization-based algorithms are how to create the discretization and when to refine it. While there have been several experimental results investigating heuristic solutions to these questions, there has been little theoretical treatment. In this paper we provide a unified theoretical analysis of tree-based hierarchical partitioning methods for online reinforcement learning, providing model-free and model-based algorithms. We show how our algorithms are able to take advantage of inherent structure of the problem by providing guarantees that scale with respect to the 'zooming dimension' instead of the ambient dimension, an instance-dependent quantity measuring…
Taxonomy
Topics: Optimization and Search Problems · Advanced Bandit Algorithms Research · Scheduling and Optimization Algorithms
