Q-Learning under Finite Model Uncertainty
Julian Sester, Cécile Decker

TL;DR
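This paper introduces a robust Q-learning algorithm for Markov decision processes in which each state-action pair has a finite ambiguity set of candidate transition kernels. The algorithm comes with convergence guarantees and supports user-designed uncertainty models beyond the common KL and Wasserstein ball formulations.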
Contribution
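The paper develops a robust Q-learning method for finite ambiguity sets, proves almost sure convergence together with non-asymptotic high-probability error bounds, and shows that the method also covers Wasserstein ball and parametric ambiguity sets through finite approximation.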
Findings
Proves almost sure convergence to the robust optimum.
Derives non-asymptotic high-probability error bounds.
Shows approximation of Wasserstein and parametric sets by finite ambiguity sets.
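To make "robust optimum" concrete, the fixed-point equation below is a sketch of the standard robust Bellman equation specialized to a finite ambiguity set. The notation is an assumption and is not taken from the paper: $r(s,a)$ is a reward depending only on the current state and action, $\gamma \in (0,1)$ is a discount factor, and $\mathcal{P}(s,a) = \{P^{(1)}_{s,a}, \dots, P^{(K)}_{s,a}\}$ is the finite set of candidate transition kernels attached to the pair $(s,a)$.

```latex
% Assumed notation (not from the paper): reward r(s,a), discount gamma,
% finite ambiguity set P(s,a) = {P^{(1)}_{s,a}, ..., P^{(K)}_{s,a}}.
\[
  Q^{*}(s,a)
  \;=\;
  r(s,a)
  \;+\;
  \gamma \min_{k \in \{1,\dots,K\}}
  \mathbb{E}_{s' \sim P^{(k)}_{s,a}}
  \Bigl[\, \max_{a'} Q^{*}(s',a') \Bigr]
\]
```

Because the inner minimum ranges over finitely many kernels, it can be evaluated by enumeration instead of solving an optimization problem over a ball of measures.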
Abstract
We propose a robust Q-learning algorithm for Markov decision processes under model uncertainty when each state-action pair is associated with a finite ambiguity set of candidate transition kernels. This finite-measure framework enables highly flexible, user-designed uncertainty models and goes beyond the common KL and Wasserstein ball formulations. We establish almost sure convergence of the learned Q-function to the robust optimum, and derive non-asymptotic high-probability error bounds that separate stochastic approximation error from transition-kernel estimation error. Finally, we show that Wasserstein ball and parametric ambiguity sets can be approximated by finite ambiguity sets, allowing our algorithm to be used as a generic solver beyond the finite setting.
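The worst-case backup over a finite ambiguity set can be illustrated with a small model-based sketch. This is robust Q-value iteration with exact expectations and fully known candidate kernels, a deliberate simplification and not the paper's sample-based Q-learning algorithm. The function name `robust_q_iteration`, the array layout `(K, S, A, S)`, and the state-action reward `rewards[s, a]` are assumptions introduced for illustration.

```python
import numpy as np


def robust_q_iteration(kernels, rewards, gamma=0.9, tol=1e-10, max_iter=10_000):
    """Robust Q-value iteration over a finite ambiguity set (illustrative sketch).

    kernels: array of shape (K, S, A, S); kernels[k, s, a] is the k-th candidate
             next-state distribution for the pair (s, a). Assumed known here.
    rewards: array of shape (S, A) with the reward for each state-action pair.
    """
    kernels = np.asarray(kernels, dtype=float)
    rewards = np.asarray(rewards, dtype=float)
    _, n_states, n_actions, _ = kernels.shape
    q = np.zeros((n_states, n_actions))
    for _ in range(max_iter):
        v = q.max(axis=1)                # greedy value of each next state, shape (S,)
        expected = kernels @ v           # expected next value per kernel, shape (K, S, A)
        worst_case = expected.min(axis=0)  # adversarial choice among the K kernels
        q_new = rewards + gamma * worst_case
        if np.max(np.abs(q_new - q)) < tol:
            return q_new
        q = q_new
    return q


# Toy example: two states, one action, reward 1 in state 0 and 0 in state 1.
# Candidate kernel 0 always moves to state 0; candidate kernel 1 always moves to state 1.
to_state_0 = np.zeros((2, 1, 2))
to_state_0[:, :, 0] = 1.0
to_state_1 = np.zeros((2, 1, 2))
to_state_1[:, :, 1] = 1.0
example_kernels = np.stack([to_state_0, to_state_1])
example_rewards = np.array([[1.0], [0.0]])

q_robust = robust_q_iteration(example_kernels, example_rewards, gamma=0.9)
q_nominal = robust_q_iteration(example_kernels[:1], example_rewards, gamma=0.9)
```

In this toy problem the adversary always sends the agent to the zero-reward state, so `q_robust` is 1 in state 0 and 0 in state 1, whereas `q_nominal`, computed with only the first kernel, is 10 and 9. A sample-based method would replace the exact expectations with stochastic estimates, which is where the stochastic approximation and kernel estimation errors mentioned in the abstract enter.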
Taxonomy
Topics: Machine Learning and ELM · Face and Expression Recognition · Distributed Sensor Networks and Detection Algorithms
Methods: Sparse Evolutionary Training
