Fitted Q-Iteration via Max-Plus-Linear Approximation

Y. Liu; M. A. S. Kolarijani

arXiv:2409.08422 · math.OC · March 11, 2025

TL;DR

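This paper introduces fitted Q-iteration (FQI) algorithms that use max-plus-linear approximators of the Q-function for offline reinforcement learning in discounted Markov decision processes. The algorithms have provable convergence, and a variational implementation has per-iteration complexity independent of the number of samples.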

Contribution

It proposes FQI algorithms in which the max-plus-linear regression at each iteration reduces to max-plus matrix-vector multiplications, together with a variational implementation whose per-iteration complexity is independent of the number of samples.

Findings

01

Proves convergence of the max-plus-linear FQI algorithm.

02

Shows that the variational implementation has a per-iteration complexity that is independent of the number of samples.

03

Shows effectiveness of the approach in offline reinforcement learning scenarios.

Abstract

In this study, we consider the application of max-plus-linear approximators for the Q-function in offline reinforcement learning of discounted Markov decision processes. In particular, we incorporate these approximators to propose novel fitted Q-iteration (FQI) algorithms with provable convergence. Exploiting the compatibility of the Bellman operator with max-plus operations, we show that the max-plus-linear regression within each iteration of the proposed FQI algorithm reduces to simple max-plus matrix-vector multiplications. We also consider the variational implementation of the proposed algorithm, which leads to a per-iteration complexity that is independent of the number of samples.
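
For readers unfamiliar with the term, a max-plus matrix-vector multiplication replaces the usual sum of products with a maximum of sums: y[i] = max_j (A[i, j] + x[j]). The sketch below illustrates only that generic operation in NumPy; it is not the paper's algorithm. The function name maxplus_matvec and the example values are hypothetical, chosen for illustration.

```python
import numpy as np


def maxplus_matvec(A, x):
    """Max-plus matrix-vector product: y[i] = max_j (A[i, j] + x[j]).

    Hypothetical helper for illustration; -inf plays the role of the
    max-plus "zero" element (an entry that never wins the maximum).
    """
    A = np.asarray(A, dtype=float)
    x = np.asarray(x, dtype=float)
    # Broadcast x across the rows of A, then take the row-wise maximum.
    return np.max(A + x[None, :], axis=1)


# Illustrative values only (not taken from the paper).
A = np.array([[0.0, -1.0],
              [2.0, -np.inf]])
x = np.array([1.0, 3.0])
y = maxplus_matvec(A, x)
print(y)
```

In this example the first entry is max(0 + 1, -1 + 3) = 2 and the second is max(2 + 1, -inf + 3) = 3. Read in the same way, a max-plus-linear approximator evaluated at a batch of samples would be one such product between a matrix of basis-function values and a parameter vector; that reading is an assumption about the general form, not a statement of the paper's exact construction.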

Taxonomy

Topics: Matrix Theory and Algorithms · Advanced Optimization Algorithms Research · Statistical and numerical algorithms