Agnostic Reinforcement Learning: Foundations and Algorithms

Gene Li

arXiv:2506.01884 · cs.LG · June 3, 2025

Open Access

TL;DR

This thesis gives a rigorous theoretical analysis of reinforcement learning with function approximation in large state spaces. It focuses on agnostic policy learning, which does not assume that the policy class contains an optimal policy.

Contribution

It develops a framework for agnostic policy learning, designs new algorithms with theoretical guarantees, and establishes fundamental performance bounds.

Findings

01. New algorithms with theoretical guarantees

02. Fundamental performance bounds characterized

03. Limitations and capabilities of agnostic RL revealed

Abstract

Reinforcement Learning (RL) has demonstrated tremendous empirical success across numerous challenging domains. However, we lack a strong theoretical understanding of the statistical complexity of RL in environments with large state spaces, where function approximation is required for sample-efficient learning. This thesis addresses this gap by rigorously examining the statistical complexity of RL with function approximation from a learning-theoretic perspective. Departing from a long history of prior work, we consider the weakest form of function approximation, called agnostic policy learning, in which the learner seeks to find the best policy in a given class $\Pi$, with no guarantee that $\Pi$ contains an optimal policy for the underlying task. We systematically explore agnostic policy learning along three key axes: environment access -- how a learner collects data from the…

Taxonomy

Topics: Advanced Research in Systems and Signal Processing