Q-Learning under Finite Model Uncertainty

Julian Sester; Cécile Decker

arXiv:2407.04259·math.OC·February 10, 2026

TL;DR

This paper introduces a robust Q-learning algorithm for Markov decision processes in which each state-action pair carries a finite ambiguity set of candidate transition kernels. It comes with convergence guarantees and allows uncertainty models more flexible than the common KL and Wasserstein ball formulations.

Contribution

The paper develops a robust Q-learning method for finite ambiguity sets, shows that the method also covers Wasserstein ball and parametric ambiguity sets through finite approximation, and establishes almost sure convergence together with non-asymptotic error bounds.

Findings

01 Proves almost sure convergence to the robust optimum.

02 Derives non-asymptotic high-probability error bounds.

03 Shows approximation of Wasserstein and parametric sets by finite ambiguity sets.
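
For orientation, the "robust optimum" in these findings can be read as the fixed point of a worst-case Bellman equation over the finite ambiguity set. The display below is a sketch in assumed notation, not necessarily the paper's: $\mathcal{P}(x,a)$ is the finite set of $N$ candidate kernels at state-action pair $(x,a)$, $r$ is a reward function, $\alpha \in (0,1)$ is a discount factor, and $A$ is the action set.

```latex
Q^*(x,a) \;=\; \min_{P \in \mathcal{P}(x,a)}
  \mathbb{E}_{X' \sim P}\!\left[\, r(x,a,X') + \alpha \max_{b \in A} Q^*(X',b) \,\right],
\qquad
\mathcal{P}(x,a) = \bigl\{ P^{(1)}(x,a), \dots, P^{(N)}(x,a) \bigr\}.
```

Because the set is finite, the minimum is attained and can be found by comparing $N$ expectations.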

Abstract

We propose a robust Q-learning algorithm for Markov decision processes under model uncertainty when each state-action pair is associated with a finite ambiguity set of candidate transition kernels. This finite-measure framework enables highly flexible, user-designed uncertainty models and goes beyond the common KL and Wasserstein ball formulations. We establish almost sure convergence of the learned Q-function to the robust optimum, and derive non-asymptotic high-probability error bounds that separate stochastic approximation error from transition-kernel estimation error. Finally, we show that Wasserstein ball and parametric ambiguity sets can be approximated by finite ambiguity sets, allowing our algorithm to be used as a generic solver beyond the finite setting.
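
To make the setting in the abstract concrete, here is a minimal tabular sketch of worst-case Q-learning over a finite ambiguity set. It is an illustration under stated assumptions, not the authors' implementation: the function names, the array layout `kernels[k, s, a, s']`, the epsilon-greedy exploration, and the `1/visits` step size are all hypothetical choices, and the candidate kernels are assumed to be known exactly (so the kernel-estimation error mentioned in the abstract does not appear here).

```python
import numpy as np


def worst_case_index(Q, kernels, reward, alpha, s, a):
    """Index of the candidate kernel minimising the expected one-step target at (s, a)."""
    # Value of landing in each next state s': r(s, a, s') + alpha * max_b Q(s', b).
    targets = reward[s, a] + alpha * Q.max(axis=1)      # shape (S,)
    expected = kernels[:, s, a, :] @ targets            # shape (K,), one per kernel
    return int(np.argmin(expected))


def robust_q_learning(kernels, reward, alpha=0.9, steps=20000, eps=0.1, seed=0):
    """Tabular Q-learning that samples transitions from the current worst-case kernel.

    kernels: array (K, S, A, S), K candidate transition kernels (assumed known).
    reward:  array (S, A, S), reward for moving from s to s' under action a.
    """
    rng = np.random.default_rng(seed)
    K, S, A, _ = kernels.shape
    Q = np.zeros((S, A))
    visits = np.zeros((S, A))
    s = 0
    for _ in range(steps):
        # Epsilon-greedy exploration (an assumed exploration scheme).
        a = int(rng.integers(A)) if rng.random() < eps else int(np.argmax(Q[s]))
        k = worst_case_index(Q, kernels, reward, alpha, s, a)
        s_next = int(rng.choice(S, p=kernels[k, s, a]))
        visits[s, a] += 1
        lr = 1.0 / visits[s, a]                          # decaying step size
        target = reward[s, a, s_next] + alpha * Q[s_next].max()
        Q[s, a] += lr * (target - Q[s, a])
        s = s_next
    return Q


def robust_value_iteration(kernels, reward, alpha=0.9, iters=500):
    """Exact worst-case value iteration, usable as a reference for the learned Q."""
    K, S, A, _ = kernels.shape
    Q = np.zeros((S, A))
    for _ in range(iters):
        targets = reward + alpha * Q.max(axis=1)[None, None, :]   # (S, A, S)
        expected = (kernels * targets[None]).sum(axis=-1)         # (K, S, A)
        Q = expected.min(axis=0)                                  # worst case over kernels
    return Q


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Two hypothetical candidate kernels on 3 states and 2 actions.
    kernels = rng.dirichlet(np.ones(3), size=(2, 3, 2))
    reward = rng.random((3, 2, 3))
    print(robust_q_learning(kernels, reward, steps=5000))
    print(robust_value_iteration(kernels, reward))
```

Comparing the output of `robust_q_learning` with `robust_value_iteration` on a small example is one way to check the sampled updates against the exact worst-case fixed point. Passing a single kernel (`kernels[k:k+1]`) recovers ordinary, non-robust value iteration.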

Code & Models

Repositories

ceciledecker/finiteqlearning
none · Official

Taxonomy

Topics: Machine Learning and ELM · Face and Expression Recognition · Distributed Sensor Networks and Detection Algorithms

Methods: Sparse Evolutionary Training