A Measure-Theoretic Finite-Sample Theory for Adaptive-Data Fitted Q-Iteration

Manuel Haussmann; Mustafa Mert Çelikok; Melih Kandemir

arXiv:2605.05791 · cs.LG · May 8, 2026

TL;DR

This paper develops a measure-theoretic finite-sample framework for fitted Q-iteration (FQI) in reinforcement learning. It bridges previously separate theoretical traditions and provides performance bounds, including an online regret guarantee, in continuous spaces.

Contribution

The paper introduces a unified measure-theoretic analysis of FQI that yields finite-sample guarantees and online regret bounds on general measurable spaces.
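For orientation, the generic FQI loop that such an analysis studies can be sketched as follows. This is a minimal illustration under strong simplifying assumptions, not the paper's setting: the five-state chain MDP, the tabular function class (for which least-squares regression reduces to a per-cell mean), the fixed batch of uniformly sampled transitions, and all names (`step`, `collect_batch`, `fit_least_squares`, `fitted_q_iteration`) are hypothetical choices made for this example. The paper itself concerns general measurable spaces and adaptively collected data.

```python
import random
from collections import defaultdict

N_STATES = 5      # hypothetical chain MDP with states 0..4
ACTIONS = (0, 1)  # 0 = move left, 1 = move right
GAMMA = 0.9       # discount factor


def step(s, a):
    """Deterministic toy dynamics: reward 1 for arriving at the right end."""
    s_next = max(0, s - 1) if a == 0 else min(N_STATES - 1, s + 1)
    reward = 1.0 if s_next == N_STATES - 1 else 0.0
    return reward, s_next


def collect_batch(n, rng):
    """Sample n transitions (s, a, r, s') with uniformly random states and actions."""
    batch = []
    for _ in range(n):
        s = rng.randrange(N_STATES)
        a = rng.choice(ACTIONS)
        r, s_next = step(s, a)
        batch.append((s, a, r, s_next))
    return batch


def fit_least_squares(inputs, targets):
    """Least squares over the tabular class: the mean target per (s, a) cell."""
    sums, counts = defaultdict(float), defaultdict(int)
    for key, y in zip(inputs, targets):
        sums[key] += y
        counts[key] += 1
    table = {key: sums[key] / counts[key] for key in sums}
    return lambda s, a: table.get((s, a), 0.0)


def fitted_q_iteration(batch, n_iters):
    """Repeat: build Bellman targets from the current Q, then regress onto them."""
    q = lambda s, a: 0.0  # Q_0 = 0
    inputs = [(s, a) for s, a, _, _ in batch]
    for _ in range(n_iters):
        # Bellman target: r + gamma * max_a' Q_k(s', a')
        targets = [r + GAMMA * max(q(s2, b) for b in ACTIONS)
                   for _, _, r, s2 in batch]
        q = fit_least_squares(inputs, targets)
    return q


rng = random.Random(0)
batch = collect_batch(500, rng)
q_hat = fitted_q_iteration(batch, n_iters=50)
greedy = [max(ACTIONS, key=lambda a: q_hat(s, a)) for s in range(N_STATES)]
print(greedy)
```

In this toy problem the greedy policy derived from `q_hat` moves right in every state. Replacing `fit_least_squares` with a regressor over a richer function class is where approximation and estimation errors, the quantities a finite-sample theory has to control, would enter.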

Findings

01. Finite-sample performance bounds for FQI on general spaces.

02. Sequential Rademacher complexity controls Bellman-regression generalization.

03. First cumulative online regret guarantee for FQI in continuous spaces.

Abstract

While reinforcement learning (RL) promises to revolutionize the control of complex nonlinear robotic systems, a profound gap persists between the heuristic success of model-free off-policy deep RL and the underlying theory, which remains largely confined to tabular or linearizable settings. We identify the cause of this gap as an emergent isolation of three traditions: (i) measure-theoretic MDP foundations on general spaces limit their analysis to exact dynamic programming and ignore all error sources of a learning process; (ii) deterministic error propagation analysis addresses the approximation error via concentrability coefficients without a finite-sample analysis of the estimation error; and (iii) PAC generalization bounds characterize the estimation errors of simplified topologies. We bridge these traditions with a unified theoretical framework for fitted Q-iteration (FQI) on…
