Adaptive Discretization in Online Reinforcement Learning

Sean R. Sinclair; Siddhartha Banerjee; Christina Lee Yu

arXiv:2110.15843 · stat.ML · September 30, 2024

TL;DR

This paper introduces a unified theoretical framework for tree-based hierarchical discretization methods in online reinforcement learning, demonstrating how these algorithms adapt to problem structure and provide guarantees that depend on the problem's inherent complexity.

Contribution

The paper offers the first comprehensive theoretical analysis of hierarchical discretization algorithms in online RL, with guarantees that depend on the problem's structure rather than ambient dimension.

Findings

01 Guarantees scale with the 'zooming dimension' instead of the ambient dimension

02 Algorithms adapt to problem structure automatically

03 Explicit bounds are provided for sample complexity, storage, and computation

Abstract

Discretization-based approaches to solving online reinforcement learning problems have been studied extensively in practice, in applications ranging from resource allocation to cache management. Two major questions in designing discretization-based algorithms are how to create the discretization and when to refine it. While several experimental results have investigated heuristic solutions to these questions, there has been little theoretical treatment. In this paper we provide a unified theoretical analysis of tree-based hierarchical partitioning methods for online reinforcement learning, providing model-free and model-based algorithms. We show how our algorithms are able to take advantage of the inherent structure of the problem by providing guarantees that scale with respect to the 'zooming dimension' instead of the ambient dimension, an instance-dependent quantity measuring…
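The two design questions named in the abstract (how to create the discretization, and when to refine it) can be illustrated with a minimal sketch of a tree-based partition of the interval [0, 1]. Everything in it is an assumption made for illustration: the names `Cell`, `find_leaf`, `observe`, and `leaves` are hypothetical, the space is one-dimensional, and the refinement rule (split a cell once its visit count reaches 1 / diameter²) is a stand-in rather than the rule analyzed in the paper. No value estimates are kept; the sketch only shows how the partition becomes finer where samples concentrate.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Cell:
    """One node of the partition tree, covering the interval [lo, hi)."""
    lo: float
    hi: float
    visits: int = 0
    children: List["Cell"] = field(default_factory=list)

    @property
    def diameter(self) -> float:
        return self.hi - self.lo

    def is_leaf(self) -> bool:
        return not self.children


def find_leaf(root: Cell, x: float) -> Cell:
    """Descend from the root to the leaf whose interval contains x."""
    node = root
    while not node.is_leaf():
        # Choose the child containing x; fall back to the last child
        # so that the right endpoint x == hi is still covered.
        node = next(
            (c for c in node.children if c.lo <= x < c.hi),
            node.children[-1],
        )
    return node


def observe(root: Cell, x: float) -> Cell:
    """Record one sample at x and refine the visited cell if needed."""
    leaf = find_leaf(root, x)
    leaf.visits += 1
    # Hypothetical refinement rule (an assumption, not the paper's):
    # split once the visit count reaches 1 / diameter^2.
    if leaf.visits >= (1.0 / leaf.diameter) ** 2:
        mid = (leaf.lo + leaf.hi) / 2
        leaf.children = [Cell(leaf.lo, mid), Cell(mid, leaf.hi)]
    return leaf


def leaves(root: Cell) -> List[Cell]:
    """Return all leaves of the tree, i.e. the current partition."""
    if root.is_leaf():
        return [root]
    return [leaf for child in root.children for leaf in leaves(child)]


# Usage: samples concentrated at one point refine only that region.
root = Cell(0.0, 1.0)
for _ in range(50):
    observe(root, 0.3)
for cell in leaves(root):
    print(f"[{cell.lo}, {cell.hi})  visits={cell.visits}")
```

In this sketch the cells around the sampled point end up narrower than the cells that were never visited, which is the qualitative behavior that adaptive discretization relies on: resolution is spent only where the data fall.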

Taxonomy

Topics: Optimization and Search Problems · Advanced Bandit Algorithms Research · Scheduling and Optimization Algorithms