Provably Adaptive Average Reward Reinforcement Learning for Metric Spaces

Avik Kar; Rahul Singh

arXiv:2410.19919 · cs.LG · February 3, 2026

TL;DR

This paper introduces ZoRL, an adaptive reinforcement learning algorithm for Lipschitz MDPs that achieves better regret bounds by zooming into promising regions, outperforming fixed discretization methods.

Contribution

The paper develops ZoRL, an adaptive algorithm that improves regret bounds in average-reward RL for Lipschitz MDPs by dynamically discretizing the state-action space.

Findings

01

ZoRL achieves regret bounds of $\tilde{O}(T^{1 - d_{eff}^{-1}})$, improving over fixed discretization methods.

02

ZoRL outperforms state-of-the-art algorithms in experiments, demonstrating the benefits of adaptivity.

03

The algorithm effectively captures problem-specific structure through the zooming dimension, leading to smaller regret in benign MDPs.

Abstract

We study infinite-horizon average-reward reinforcement learning (RL) for Lipschitz MDPs, a broad class that subsumes several important classes such as linear and RKHS MDPs and function approximation frameworks, and develop an adaptive algorithm $ZoRL$ with regret bounded as $\tilde{O}(T^{1 - d_{eff.}^{-1}})$, where $d_{eff.} = 2 d_{S} + d_{z} + 3$, $d_{S}$ is the dimension of the state space, and $d_{z}$ is the zooming dimension. In contrast, algorithms with fixed discretization yield $d_{eff.} = 2 (d_{S} + d_{A}) + 2$, $d_{A}$ being the dimension of the action space. $ZoRL$ achieves this by discretizing the state-action space adaptively and zooming into "promising regions" of the state-action space. $d_{z}$, a problem-dependent quantity bounded by the state-action space's dimension, allows us to conclude that if an…
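To make the two effective dimensions quoted in the abstract concrete, the short sketch below evaluates the regret exponent $1 - d_{eff.}^{-1}$ for both formulas. It only does arithmetic on the expressions as stated above; the function names and the example dimensions ($d_{S} = 2$, $d_{A} = 2$) are illustrative assumptions, not values taken from the paper.

```python
# Illustrative arithmetic only: the function names and the example
# dimensions below are assumptions, not taken from the paper.

def effective_dimension_adaptive(d_s: int, d_z: float) -> float:
    """d_eff = 2*d_S + d_z + 3, the adaptive formula quoted in the abstract."""
    return 2 * d_s + d_z + 3


def effective_dimension_fixed(d_s: int, d_a: int) -> float:
    """d_eff = 2*(d_S + d_A) + 2, the fixed-discretization formula quoted in the abstract."""
    return 2 * (d_s + d_a) + 2


def regret_exponent(d_eff: float) -> float:
    """Exponent of T in a regret bound of order T^(1 - 1/d_eff)."""
    return 1.0 - 1.0 / d_eff


if __name__ == "__main__":
    d_s, d_a = 2, 2  # hypothetical state and action dimensions
    fixed = effective_dimension_fixed(d_s, d_a)
    print(f"fixed discretization: d_eff={fixed}, exponent={regret_exponent(fixed):.4f}")

    # d_z is bounded by the state-action dimension d_S + d_A.
    for d_z in range(0, d_s + d_a + 1):
        adaptive = effective_dimension_adaptive(d_s, d_z)
        print(f"adaptive, d_z={d_z}: d_eff={adaptive}, exponent={regret_exponent(adaptive):.4f}")
```

A smaller exponent means slower regret growth in $T$. Comparing the two formulas algebraically, the adaptive expression is the smaller one exactly when $d_{z} < 2 d_{A} - 1$, so under these formulas the gain comes from problems whose zooming dimension is small relative to the action dimension.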

Taxonomy

Topics: Smart Parking Systems Research