Gradient-Bounded Dynamic Programming for Submodular and Concave Extensible Value Functions with Probabilistic Performance Guarantees

Denis Lebedev; Paul Goulart; Kostas Margellos

arXiv:2006.02910·math.OC·June 5, 2020·Autom.

TL;DR

This paper introduces a dual dynamic programming algorithm for high-dimensional stochastic dynamic programs whose value functions are submodular and concave extensible. It provides probabilistic performance guarantees and is demonstrated on a delivery pricing example.

Contribution

The paper presents a dual dynamic programming algorithm that computes deterministic upper bounds and stochastic lower bounds on the value function, with probabilistic guarantees on the performance of the associated policy, as a way to mitigate the curse of dimensionality.

Findings

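01 The algorithm terminates after a finite number of iterations.

02 It provides probabilistic guarantees on the value accumulated under the associated policy, both for a single realisation and in expectation.

03 It is effective on a high-dimensional delivery pricing example.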

Abstract

We consider stochastic dynamic programming problems with high-dimensional, discrete state-spaces and finite, discrete-time horizons that prohibit direct computation of the value function from a given Bellman equation for all states and time steps due to the "curse of dimensionality". For the case where the value function of the dynamic program is concave extensible and submodular in its state-space, we present a new algorithm that computes deterministic upper and stochastic lower bounds of the value function in the realm of dual dynamic programming. We show that the proposed algorithm terminates after a finite number of iterations. Furthermore, we derive probabilistic guarantees on the value accumulated under the associated policy for a single realisation of the dynamic program and for the expectation of this value. Finally, we demonstrate the efficacy of our approach on a…
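The abstract assumes a value function that is submodular on a discrete state space. As a rough illustration of that property only (not of the paper's algorithm), the sketch below brute-force checks the standard lattice inequality f(x ∨ y) + f(x ∧ y) ≤ f(x) + f(y) on a small integer box. The function names, the box bounds, and the two example functions are assumptions made for this illustration and do not come from the paper.

```python
from itertools import product


def is_submodular(f, bounds, tol=1e-12):
    """Brute-force check of f(join) + f(meet) <= f(x) + f(y) on an integer box.

    `bounds` gives the inclusive upper limit of each coordinate; the lower
    limit is 0. Cost grows with the square of the number of grid points, so
    this is only usable on tiny examples.
    """
    points = list(product(*(range(b + 1) for b in bounds)))
    for x in points:
        for y in points:
            join = tuple(max(a, b) for a, b in zip(x, y))  # componentwise max
            meet = tuple(min(a, b) for a, b in zip(x, y))  # componentwise min
            if f(join) + f(meet) > f(x) + f(y) + tol:
                return False
    return True


def concave_of_sum(x):
    # A concave function of the coordinate sum; such functions are submodular.
    return sum(x) ** 0.5


def product_fn(x):
    # x1 * x2 is supermodular, so it violates the inequality at (1, 0), (0, 1).
    return x[0] * x[1]


print(is_submodular(concave_of_sum, (2, 2, 2)))
print(is_submodular(product_fn, (2, 2)))
```

The check enumerates every pair of grid points, which is exactly the kind of exhaustive computation that becomes infeasible in high dimensions and that bound-based methods aim to avoid.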
