Projection by Convolution: Optimal Sample Complexity for Reinforcement Learning in Continuous-Space MDPs

Davide Maran; Alberto Maria Metelli; Matteo Papini; Marcello Restelli

arXiv:2405.06363·cs.LG·May 13, 2024

TL;DR

This paper introduces a novel projection technique using harmonic analysis for reinforcement learning in continuous-space MDPs, achieving rate-optimal sample complexity that bridges discretization and low-rank approaches.

Contribution

It presents a simple perturbed version of least-squares value iteration, combined with a new projection technique based on harmonic analysis, that achieves rate-optimal sample complexity for continuous-space MDPs with smooth Bellman operators.

Findings

01

Achieves rate-optimal sample complexity for continuous-space MDPs with smooth Bellman operators, given access to a generative model.

02

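Recovers the state-of-the-art rate of discretization approaches for Lipschitz MDPs ($\nu = 0$) and, as $\nu \to \infty$, recovers and generalizes the $O(\epsilon^{-2})$ rate of low-rank MDPs.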

03

Bridges the gap between discretization and low-rank approaches.
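
To make the two limiting cases concrete, the short Python sketch below evaluates the exponent of the rate quoted in the abstract, $O(\epsilon^{-2 - d/(\nu + 1)})$. The function name `sample_complexity_exponent` is hypothetical and chosen only for illustration; the sketch tracks the exponent alone and ignores constants and any logarithmic factors.

```python
def sample_complexity_exponent(d: int, nu: float) -> float:
    """Return p such that the quoted rate reads O(eps^-p), with p = 2 + d / (nu + 1).

    d  : dimension of the state-action space
    nu : order of smoothness (nu = 0 is the Lipschitz case)
    """
    if d < 0 or nu < 0:
        raise ValueError("d and nu must be non-negative")
    return 2.0 + d / (nu + 1.0)


if __name__ == "__main__":
    d = 4  # illustrative dimension, not taken from the paper
    for nu in (0, 1, 3, 1e6):
        p = sample_complexity_exponent(d, nu)
        print(f"d={d}, nu={nu}: rate O(eps^-{p:.4f})")
```

With $\nu = 0$ the exponent is $2 + d$, the Lipschitz case; as $\nu$ grows the exponent approaches 2, matching the $O(\epsilon^{-2})$ rate mentioned for low-rank MDPs.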

Abstract

We consider the problem of learning an $\epsilon$-optimal policy in a general class of continuous-space Markov decision processes (MDPs) having smooth Bellman operators. Given access to a generative model, we achieve rate-optimal sample complexity by performing a simple, perturbed version of least-squares value iteration with orthogonal trigonometric polynomials as features. Key to our solution is a novel projection technique based on ideas from harmonic analysis. Our $O(\epsilon^{-2 - d/(\nu + 1)})$ sample complexity, where $d$ is the dimension of the state-action space and $\nu$ the order of smoothness, recovers the state-of-the-art result of discretization approaches for the special case of Lipschitz MDPs ($\nu = 0$). At the same time, for $\nu \to \infty$, it recovers and greatly generalizes the $O(\epsilon^{-2})$ rate of low-rank MDPs, which are…
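
As a rough illustration of what "orthogonal trigonometric polynomials as features" can look like in a least-squares fit, here is a minimal one-dimensional Python sketch. It is not the paper's algorithm: it has no MDP, no perturbation, and no projection-by-convolution step, and the names `trig_features`, `fit_on_grid`, and `predict` are hypothetical. It assumes targets are observed on a uniform grid of `N` points in $[0, 1)$ with `N > 2K`, in which case the features are exactly orthonormal on the grid and the least-squares coefficients reduce to plain inner products.

```python
import math
from typing import Callable, List


def trig_features(x: float, K: int) -> List[float]:
    """Features [1, sqrt(2)cos(2*pi*k*x), sqrt(2)sin(2*pi*k*x)] for k = 1..K."""
    feats = [1.0]
    for k in range(1, K + 1):
        feats.append(math.sqrt(2) * math.cos(2 * math.pi * k * x))
        feats.append(math.sqrt(2) * math.sin(2 * math.pi * k * x))
    return feats


def fit_on_grid(f: Callable[[float], float], K: int, N: int) -> List[float]:
    """Least-squares coefficients of f on the uniform grid {j/N : j = 0..N-1}.

    For N > 2K the feature Gram matrix on this grid is the identity,
    so the least-squares solution is the average of f(x) * phi_i(x).
    """
    if N <= 2 * K:
        raise ValueError("need N > 2K for orthonormality on the grid")
    coeffs = [0.0] * (2 * K + 1)
    for j in range(N):
        x = j / N
        y = f(x)
        for i, phi in enumerate(trig_features(x, K)):
            coeffs[i] += y * phi / N
    return coeffs


def predict(coeffs: List[float], x: float) -> float:
    """Evaluate the fitted trigonometric polynomial at x."""
    K = (len(coeffs) - 1) // 2
    return sum(c * phi for c, phi in zip(coeffs, trig_features(x, K)))


def target(x: float) -> float:
    """Illustrative smooth periodic target (an assumption, not from the paper)."""
    return 1.0 + 2.0 * math.cos(2 * math.pi * x) - 0.5 * math.sin(4 * math.pi * x)


coeffs = fit_on_grid(target, K=2, N=8)
```

Because `target` lies in the span of the degree-2 features, the fit reproduces it up to floating-point error. For a target outside that span, the same code returns its least-squares approximation on the grid.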

Taxonomy

Topics: Evolutionary Algorithms and Applications · Neural Networks and Applications · VLSI and FPGA Design Techniques