Fitted Q-Iteration via Max-Plus-Linear Approximation
Y. Liu, M. A. S. Kolarijani

TL;DR
This paper introduces a fitted Q-iteration algorithm that uses max-plus-linear approximators for offline reinforcement learning, with provable convergence and a per-iteration complexity that is independent of the number of samples.
Contribution
It proposes an FQI method in which the max-plus-linear regression step of each iteration reduces to max-plus matrix-vector multiplications, together with a variational implementation whose per-iteration cost does not depend on the number of samples.
Findings
Proves convergence of the max-plus-linear FQI algorithm.
Shows that the variational implementation has a per-iteration complexity independent of the number of samples.
Shows effectiveness of the approach in offline reinforcement learning scenarios.
Abstract
In this study, we consider the application of max-plus-linear approximators for the Q-function in offline reinforcement learning of discounted Markov decision processes. In particular, we incorporate these approximators to propose novel fitted Q-iteration (FQI) algorithms with provable convergence. Exploiting the compatibility of the Bellman operator with max-plus operations, we show that the max-plus-linear regression within each iteration of the proposed FQI algorithm reduces to simple max-plus matrix-vector multiplications. We also consider the variational implementation of the proposed algorithm, which leads to a per-iteration complexity that is independent of the number of samples.
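For readers unfamiliar with max-plus algebra, the following is a minimal sketch of the two primitives the abstract refers to. In max-plus arithmetic, "addition" is max and "multiplication" is ordinary +, so a matrix-vector product is (A ⊗ x)_i = max_j (A_ij + x_j). A standard result from max-plus algebra (residuation) gives the largest x satisfying A ⊗ x ≤ b as x_j = min_i (b_i − A_ij), which is one common way to pose a max-plus-linear fit. The function names, the feature matrix `A`, and the targets `b` are hypothetical illustrations; this is not necessarily the regression formulation used in the paper.

```python
def maxplus_matvec(A, x):
    """Max-plus product: (A (x) x)_i = max_j (A[i][j] + x[j])."""
    return [max(a + xj for a, xj in zip(row, x)) for row in A]


def greatest_subsolution(A, b):
    """Largest x with A (x) x <= b elementwise: x_j = min_i (b[i] - A[i][j])."""
    n_cols = len(A[0])
    return [min(bi - row[j] for row, bi in zip(A, b)) for j in range(n_cols)]


# Hypothetical data: rows are samples, columns are basis functions.
A = [[0.0, -1.0],
     [-2.0, 0.0],
     [-1.0, -1.0]]
b = [3.0, 5.0, 2.5]  # hypothetical regression targets

theta = greatest_subsolution(A, b)  # max-plus-linear weights
fit = maxplus_matvec(A, theta)      # fitted values, never exceeding b

print(theta)  # [3.0, 3.5]
print(fit)    # [3.0, 3.5, 2.5]
```

Both operations cost one pass over the entries of `A` and involve only additions, subtractions, and max/min comparisons, which is why this kind of fit avoids solving a least-squares system.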
Taxonomy
Topics: Matrix Theory and Algorithms · Advanced Optimization Algorithms Research · Statistical and numerical algorithms
