A General Framework for Bounding Approximate Dynamic Programming Schemes

Yajing Liu; Edwin Chong; Ali Pezeshki; Zhenliang Zhang

arXiv:1809.05249·math.OC·June 18, 2020·IEEE Control. Syst. Lett.

TL;DR

This paper presents a unified framework for bounding the performance of approximate dynamic programming methods in stochastic settings, using curvature-based bounds on surrogate string optimization problems.

Contribution

It introduces a novel bounding approach that quantifies performance guarantees for ADP schemes without requiring submodularity, based on curvature measures of surrogate objectives.

Findings

01 Provides bounds on ADP performance in stochastic problems

02 Relates ADP schemes to greedy solutions in string optimization

03 Introduces curvature-based bounds independent of submodularity

Abstract

For years, there has been interest in approximation methods for solving dynamic programming problems, because of the inherent complexity in computing optimal solutions characterized by Bellman's principle of optimality. A wide range of approximate dynamic programming (ADP) methods now exists. It is of great interest to guarantee that the performance of an ADP scheme be at least some known fraction, say $β$, of optimal. This paper introduces a general approach to bounding the performance of ADP methods, in this sense, in the stochastic setting. The approach is based on new results for bounding greedy solutions in string optimization problems, where one has to choose a string (ordered set) of actions to maximize an objective function. This bounding technique is inspired by submodularity theory, but submodularity is not required for establishing bounds. Instead, the bounding is based…
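To make the string-optimization setting in the abstract concrete, the following is a minimal sketch of a greedy scheme on a toy problem, compared against a brute-force optimum to obtain an empirical ratio in the sense of the fraction $β$. Everything in it is an assumption for illustration: the action names, the `VALUE` and `DISCOUNT` constants, the `objective` function, and the helpers `greedy_string` and `optimal_string` are hypothetical and do not come from the paper. It does not compute the paper's curvature-based bound, which the truncated abstract does not spell out.

```python
from itertools import product

# Hypothetical toy problem (not from the paper): three actions with base
# values, a per-step discount, and a penalty for repeating an action.
ACTIONS = ("a", "b", "c")
VALUE = {"a": 3.0, "b": 2.0, "c": 1.0}
DISCOUNT = 0.9


def objective(string):
    """Order-dependent value of a string (ordered tuple) of actions."""
    total = 0.0
    seen = set()
    for k, action in enumerate(string):
        gain = VALUE[action]
        if action in seen:
            gain *= 0.5  # an action that already appeared earns half
        total += gain * DISCOUNT ** k
        seen.add(action)
    return total


def greedy_string(actions, f, horizon):
    """Build a string one action at a time, each step appending the
    action that maximizes f on the extended string."""
    string = ()
    for _ in range(horizon):
        best = max(actions, key=lambda a: f(string + (a,)))
        string += (best,)
    return string


def optimal_string(actions, f, horizon):
    """Exhaustive search over all strings of the given length
    (only feasible for tiny problems)."""
    return max(product(actions, repeat=horizon), key=f)


if __name__ == "__main__":
    horizon = 3
    g = greedy_string(ACTIONS, objective, horizon)
    o = optimal_string(ACTIONS, objective, horizon)
    ratio = objective(g) / objective(o)  # empirical fraction of optimal
    print("greedy:", g, "optimal:", o, "ratio:", ratio)
```

The exhaustive search is only there to measure how close the greedy string comes to optimal on a small instance; a performance bound of the kind the abstract describes would aim to guarantee such a fraction without enumerating all strings.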
