A General Framework for Bounding Approximate Dynamic Programming Schemes
Yajing Liu, Edwin Chong, Ali Pezeshki, Zhenliang Zhang

TL;DR
This paper presents a unified framework for bounding the performance of approximate dynamic programming methods in stochastic settings, using curvature-based bounds on surrogate string optimization problems.
Contribution
It introduces a novel bounding approach that provides performance guarantees for ADP schemes without requiring submodularity, relying instead on curvature measures of surrogate objectives.
Findings
Provides bounds on ADP performance in stochastic problems
Relates ADP schemes to greedy solutions in string optimization
Introduces curvature-based bounds independent of submodularity
Abstract
For years, there has been interest in approximation methods for solving dynamic programming problems, because of the inherent complexity in computing optimal solutions characterized by Bellman's principle of optimality. A wide range of approximate dynamic programming (ADP) methods now exists. It is of great interest to guarantee that the performance of an ADP scheme be at least some known fraction, say $\beta$, of optimal. This paper introduces a general approach to bounding the performance of ADP methods, in this sense, in the stochastic setting. The approach is based on new results for bounding greedy solutions in string optimization problems, where one has to choose a string (ordered set) of actions to maximize an objective function. This bounding technique is inspired by submodularity theory, but submodularity is not required for establishing bounds. Instead, the bounding is based…
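To make the string-optimization setting in the abstract concrete, the sketch below builds a greedy string one action at a time and compares its value with a brute-force optimum. It is only an illustration under stated assumptions: the action set, the `COVER` table, the discount 0.9, and the `objective` function are hypothetical toy choices, not taken from the paper, and the computed ratio is an empirical value for this toy instance rather than the paper's curvature-based bound.

```python
from itertools import product

# Hypothetical toy data: each action "covers" some items (an assumption for illustration).
COVER = {"a": {1, 2}, "b": {2, 3}, "c": {4}}
ACTIONS = ["a", "b", "c"]
HORIZON = 2


def objective(string):
    """Order-dependent toy objective: newly covered items, discounted by position."""
    covered = set()
    total = 0.0
    for k, action in enumerate(string):
        new_items = COVER[action] - covered
        total += len(new_items) * (0.9 ** k)
        covered |= new_items
    return total


def greedy_string(actions, horizon, f):
    """Append, at each step, the action that maximizes f on the extended string."""
    string = ()
    for _ in range(horizon):
        best_action = max(actions, key=lambda a: f(string + (a,)))
        string = string + (best_action,)
    return string


def optimal_string(actions, horizon, f):
    """Exhaustive search over all strings of the given length (feasible for toy sizes only)."""
    return max(product(actions, repeat=horizon), key=f)


greedy = greedy_string(ACTIONS, HORIZON, objective)
optimal = optimal_string(ACTIONS, HORIZON, objective)
# Fraction of the optimal value achieved by the greedy string on this instance.
ratio = objective(greedy) / objective(optimal)
print(greedy, optimal, ratio)
```

The ratio printed here plays the role of the fraction of optimal discussed in the abstract, measured directly on one small instance; a bound of the kind the paper develops would guarantee such a fraction without enumerating all strings.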
