Generalized Dual Dynamic Programming for Infinite Horizon Problems in Continuous State and Action Spaces

Joseph Warrington; Paul N. Beuchat; John Lygeros

arXiv:1711.07222·math.OC·October 5, 2018·IEEE Trans. Autom. Control.

TL;DR

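This paper introduces a nonlinear generalization of dual dynamic programming for deterministic, discrete-time, infinite horizon control problems over continuous state and action spaces. It builds successively tighter lower bounds on the optimal value function and reaches a desired Bellman optimality tolerance at pre-selected points in the state space.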

Contribution

The paper develops a nonlinear dual dynamic programming framework with convergence guarantees and practical certification methods for high-dimensional control problems.

Findings

01

Successfully approximates value functions in high-dimensional systems.

02

Proves, under certain assumptions, that a desired Bellman optimality tolerance is reached at pre-selected states in a finite number of iterations.

03

Demonstrates effectiveness on large-scale control systems.

Abstract

We describe a nonlinear generalization of dual dynamic programming theory and its application to value function estimation for deterministic control problems over continuous state and action spaces, in a discrete-time infinite horizon setting. We prove, using a Benders-type argument leveraging the monotonicity of the Bellman operator, that the result of a one-stage policy evaluation can be used to produce nonlinear lower bounds on the optimal value function that are valid over the entire state space. These bounds contain terms reflecting the functional form of the system's costs, dynamics, and constraints. We provide an iterative algorithm that produces successively better approximations of the optimal value function, and prove under certain assumptions that it achieves an arbitrarily low desired Bellman optimality tolerance at pre-selected points in the state space, in a finite number…
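
The lower-bounding argument in the abstract rests on the monotonicity of the Bellman operator. A minimal sketch in assumed notation follows: the stage cost $\ell$, dynamics $f$, discount factor $\gamma$, and operator symbol $\mathcal{T}$ are illustrative choices that do not appear in the text above, and input and state constraints are omitted for brevity.

```latex
% Assumed notation (not from the text above):
%   \ell(x,u)  stage cost,  f(x,u)  dynamics,  \gamma \in (0,1)  discount factor,
%   V^\star    optimal value function,  \mathcal{T}  one-stage Bellman operator.
\begin{align*}
(\mathcal{T}V)(x) &= \min_{u} \big\{ \ell(x,u) + \gamma\, V\big(f(x,u)\big) \big\}, \\
V \le W \ \text{pointwise} &\implies \mathcal{T}V \le \mathcal{T}W \ \text{pointwise}, \\
V \le V^\star &\implies \mathcal{T}V \le \mathcal{T}V^\star = V^\star .
\end{align*}
```

Under these assumptions, applying the one-stage operator to any lower bound on $V^\star$ yields another lower bound on $V^\star$, which is the property the Benders-type argument leverages.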
