Global Optimization for Value Function Approximation
Marek Petrik, Shlomo Zilberstein

TL;DR
This paper formulates value function approximation as a bilinear program solved by global optimization, gives strong a priori bounds on policy loss, and analyzes optimal and approximate algorithms for these NP-hard problems.
Contribution
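It presents a novel approximate bilinear programming formulation of value function approximation. By minimizing specific norms of the Bellman residual, the formulation provides a priori guarantees on policy loss, which existing methods often lack, and it comes with both optimal and approximate solution algorithms.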
Findings
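The approach offers strong a priori bounds on both robust and expected policy loss.
Both optimal and approximate algorithms for solving bilinear programs are described and analyzed; the analysis yields a convergent generalization of approximate policy iteration.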
The method effectively minimizes Bellman residuals on benchmark problems.
Abstract
Existing value function approximation methods have been successfully used in many applications, but they often lack useful a priori error bounds. We propose a new approximate bilinear programming formulation of value function approximation, which employs global optimization. The formulation provides strong a priori guarantees on both robust and expected policy loss by minimizing specific norms of the Bellman residual. Solving a bilinear program optimally is NP-hard, but this is unavoidable because the Bellman-residual minimization itself is NP-hard. We describe and analyze both optimal and approximate algorithms for solving bilinear programs. The analysis shows that this algorithm offers a convergent generalization of approximate policy iteration. We also briefly analyze the behavior of bilinear programming algorithms under incomplete samples. Finally, we demonstrate that the proposed…
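As an illustration of the quantity the abstract refers to, the following is a minimal sketch (not the authors' bilinear programming algorithm) that evaluates the max-norm Bellman residual of a linear value function v = Phi w, using the standard Bellman operator (Lv)(s) = max_a [r(s, a) + gamma * sum_s' P(s' | s, a) v(s')]. The MDP, the feature matrix `Phi`, the weights `w`, and the function names are all hypothetical and chosen only for this example.

```python
import numpy as np

def bellman_operator(P, r, gamma, v):
    """Apply (Lv)(s) = max_a [ r[a, s] + gamma * sum_s' P[a, s, s'] * v[s'] ]."""
    q = r + gamma * (P @ v)  # action values, shape (num_actions, num_states)
    return q.max(axis=0)

def bellman_residual_norm(P, r, gamma, Phi, w):
    """Max-norm Bellman residual ||v - Lv||_inf for the linear approximation v = Phi @ w."""
    v = Phi @ w
    return np.max(np.abs(v - bellman_operator(P, r, gamma, v)))

# Hypothetical MDP with 2 actions and 3 states; P[a, s, s'] is a transition probability.
P = np.array([
    [[0.9, 0.1, 0.0], [0.0, 0.9, 0.1], [0.1, 0.0, 0.9]],
    [[0.1, 0.9, 0.0], [0.0, 0.1, 0.9], [0.9, 0.0, 0.1]],
])
r = np.array([[0.0, 0.0, 1.0],   # rewards r[a, s] (assumed values)
              [0.0, 0.5, 0.0]])
gamma = 0.9

# Hypothetical features: a constant feature and a linear feature over the state index.
Phi = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
w = np.array([1.0, 0.5])

residual = bellman_residual_norm(P, r, gamma, Phi, w)
print("Bellman residual (max norm):", residual)
```

The residual is zero exactly when v is a fixed point of L, that is, the optimal value function. A Bellman-residual minimization method searches over the weights `w` to make this quantity small; the sketch only evaluates it for one fixed `w`.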
Taxonomy
Topics: Advanced Control Systems Optimization · Advanced Optimization Algorithms Research · Risk and Portfolio Optimization
