Contraction-Aligned Analysis of Soft Bellman Residual Minimization with Weighted Lp-Norm for Markov Decision Problem
Hyukjun Yang, Han-Dong Lim, Donghwan Lee

TL;DR
This paper introduces a soft Bellman residual minimization approach using weighted Lp-norms that aligns with the contraction properties of the Bellman operator, improving error control in Markov decision processes.
Contribution
The paper extends soft Bellman residual minimization to a generalized weighted Lp-norm, tying the objective to the contraction geometry of the Bellman operator and deriving corresponding performance error bounds.
Findings
Alignment of residual minimization with Bellman contraction as p increases
Derivation of performance error bounds for the proposed method
Enhanced control of error propagation in policy evaluation
Abstract
The problem of solving Markov decision processes under function approximation remains a fundamental challenge, even under linear function approximation settings. A key difficulty arises from a geometric mismatch: while the Bellman optimality operator is contractive in the L∞-norm, commonly used objectives such as projected value iteration and Bellman residual minimization rely on L2-based formulations. To enable gradient-based optimization, we consider a soft formulation of Bellman residual minimization and extend it to a generalized weighted Lp-norm. We show that this formulation aligns the optimization objective with the contraction geometry of the Bellman operator as p increases, and derive corresponding performance error bounds. Our analysis provides a principled connection between residual minimization and Bellman contraction, leading to improved control of error propagation…
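To make the geometric point in the abstract concrete, the following is a minimal illustrative sketch and not the authors' implementation. It assumes a small tabular MDP, a "soft" Bellman operator in which the max over next actions is replaced by a temperature-scaled log-sum-exp, and weights that form a probability distribution over state-action pairs. The function names, the temperature `tau`, and the random MDP are all hypothetical choices made for illustration. The sketch evaluates the weighted Lp-norm of the soft Bellman residual for increasing p and compares it with the L∞-norm.

```python
import numpy as np


def soft_bellman_residual(Q, P, R, gamma, tau):
    """Return T_soft(Q) - Q for a tabular Q of shape (S, A).

    P has shape (S, A, S) (transition probabilities), R has shape (S, A).
    Assumption: the hard max over next actions is replaced by
    tau * logsumexp(Q / tau), computed in a numerically stable way.
    """
    m = Q.max(axis=1)
    soft_v = m + tau * np.log(np.exp((Q - m[:, None]) / tau).sum(axis=1))
    return R + gamma * (P @ soft_v) - Q


def weighted_lp_norm(x, w, p):
    """Weighted Lp-norm (sum_i w_i |x_i|^p)^(1/p), rescaled to avoid overflow."""
    scale = float(np.abs(x).max())
    if scale == 0.0:
        return 0.0
    return scale * float((w * (np.abs(x) / scale) ** p).sum() ** (1.0 / p))


# Hypothetical random MDP, used only for illustration.
rng = np.random.default_rng(0)
S, A = 4, 3
P = rng.random((S, A, S))
P /= P.sum(axis=2, keepdims=True)      # normalize rows into distributions
R = rng.random((S, A))
Q = rng.normal(size=(S, A))
w = np.full((S, A), 1.0 / (S * A))     # assumed uniform probability weights

res = soft_bellman_residual(Q, P, R, gamma=0.9, tau=0.1)
sup_norm = float(np.abs(res).max())
norms = [weighted_lp_norm(res, w, p) for p in (2, 8, 32, 128)]

for p, value in zip((2, 8, 32, 128), norms):
    print(f"p={p:>3}: weighted Lp residual = {value:.4f}")
print(f"L-infinity residual = {sup_norm:.4f}")
```

With strictly positive probability weights, the weighted Lp-norm is nondecreasing in p and approaches the L∞-norm from below, which is the sense in which a larger p moves the objective toward the norm in which the Bellman operator contracts.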
