An Efficient Policy Iteration Algorithm for Dynamic Programming Equations

Alessandro Alla; Maurizio Falcone; Dante Kalise

arXiv:1308.2087 · math.OC · February 22, 2016 · SIAM J. Sci. Comput.

Open Access

TL;DR

This paper introduces a hybrid algorithm for static Hamilton-Jacobi-Bellman equations arising in optimal control problems. It starts with value iteration and switches to policy iteration once an error threshold is reached, which reduces computation time compared to traditional methods.

Contribution

The paper proposes a novel coupling of value and policy iteration methods with an adaptive switching strategy to enhance computational efficiency for static HJB equations.

Findings

01. Effective in dimensions two, three, and four

02. Reduces computation time compared to traditional methods

03. Demonstrates superlinear convergence in relevant cases

Abstract

We present an accelerated algorithm for the solution of static Hamilton-Jacobi-Bellman equations related to optimal control problems. Our scheme is based on a classic policy iteration procedure, which is known to have superlinear convergence in many relevant cases provided the initial guess is sufficiently close to the solution. In many cases, this limitation degenerates into a behavior similar to a value iteration method, with an increased computation time. The new scheme circumvents this problem by combining the advantages of both algorithms with an efficient coupling. The method starts with a value iteration phase and then switches to a policy iteration procedure when a certain error threshold is reached. A delicate point is to determine this threshold in order to avoid cumbersome computation with the value iteration and, at the same time, to be reasonably sure that the policy…
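The two-phase idea described in the abstract (value iteration first, then policy iteration once an error threshold is reached) can be sketched on a small finite problem. The following is only an illustration under stated assumptions, not the authors' scheme: it uses a discounted finite-state Markov decision problem as a stand-in for a discretized HJB equation, and the names `build_toy_problem`, `bellman_q`, `vi_then_pi`, the test problem itself, and the fixed value `switch_tol=1e-1` are all hypothetical. How the paper selects the switching threshold is not reproduced here.

```python
import numpy as np


def build_toy_problem(n_states=21):
    """Hypothetical 1D test problem: move left, stay, or move right on a grid over [-1, 1]."""
    moves = np.array([-1, 0, 1])
    x = np.linspace(-1.0, 1.0, n_states)
    P = np.zeros((len(moves), n_states, n_states))  # P[a, s, s'] transition probabilities
    c = np.zeros((len(moves), n_states))            # c[a, s] running cost
    for a, m in enumerate(moves):
        for s in range(n_states):
            s_next = min(max(s + m, 0), n_states - 1)  # clamp at the grid boundary
            P[a, s, s_next] = 1.0
            c[a, s] = x[s] ** 2 + 0.1 * m ** 2
    return P, c


def bellman_q(P, c, gamma, v):
    """Cost of taking action a in state s and then following v: shape (actions, states)."""
    return c + gamma * (P @ v)


def vi_then_pi(P, c, gamma, switch_tol=1e-1, tol=1e-10, max_iter=10_000):
    """Value iteration until successive iterates differ by less than switch_tol,
    then policy iteration. switch_tol is an assumed, hand-picked threshold."""
    n_states = P.shape[1]
    v = np.zeros(n_states)

    # Phase 1: coarse value iteration to obtain a reasonable initial guess.
    vi_sweeps = 0
    for _ in range(max_iter):
        v_new = bellman_q(P, c, gamma, v).min(axis=0)
        vi_sweeps += 1
        close_enough = np.max(np.abs(v_new - v)) < switch_tol
        v = v_new
        if close_enough:
            break

    # Phase 2: policy iteration started from the greedy policy of the phase-1 value.
    states = np.arange(n_states)
    policy = bellman_q(P, c, gamma, v).argmin(axis=0)
    pi_steps = 0
    for _ in range(max_iter):
        P_pi = P[policy, states]  # row s is the transition row under policy[s]
        c_pi = c[policy, states]
        # Policy evaluation: solve the linear system (I - gamma * P_pi) v = c_pi.
        v_new = np.linalg.solve(np.eye(n_states) - gamma * P_pi, c_pi)
        # Policy improvement: greedy policy with respect to the evaluated value.
        new_policy = bellman_q(P, c, gamma, v_new).argmin(axis=0)
        pi_steps += 1
        finished = np.array_equal(new_policy, policy) or np.max(np.abs(v_new - v)) < tol
        v, policy = v_new, new_policy
        if finished:
            break
    return v, policy, vi_sweeps, pi_steps


P, c = build_toy_problem(21)
v, policy, vi_sweeps, pi_steps = vi_then_pi(P, c, gamma=0.9)
print(f"value-iteration sweeps: {vi_sweeps}, policy-iteration steps: {pi_steps}")
```

In this sketch a larger `switch_tol` hands over to policy iteration earlier, with a rougher initial guess, while a smaller one spends more sweeps in the cheaper but slower value iteration phase. That is the trade-off the abstract calls a delicate point.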

Taxonomy

Topics: Adaptive Dynamic Programming Control · Optimization and Variational Analysis · Advanced Bandit Algorithms Research