Ternary Policy Iteration Algorithm for Nonlinear Robust Control

Jie Li; Shengbo Eben Li; Yang Guan; Jingliang Duan; Wenyu Li; Yuming Yin

arXiv:2007.06810 · eess.SY · July 15, 2020 · 1 citation

Open Access

TL;DR

This paper introduces a ternary policy iteration algorithm for nonlinear robust control. It formulates the problem as a two-player zero-sum differential game and demonstrates convergence and disturbance resistance in simulation.

Contribution

The paper presents a novel ternary policy iteration (TPI) algorithm that updates policies directly using loss functions derived from the Hamilton-Jacobi-Isaacs (HJI) equation. The algorithm applies to nonlinear systems with bounded uncertainties.
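
For orientation, a zero-sum differential game of this kind is commonly written in the form below. This is the standard textbook formulation, not a transcription of the paper: the input-affine dynamics, the quadratic utility, and the symbols f, g, k, Q, R, and \gamma are assumptions, and the paper's own cost and notation may differ.

```latex
% Assumed input-affine plant with control u and disturbance w
\dot{x} = f(x) + g(x)\,u + k(x)\,w

% Assumed quadratic utility; \gamma weights the disturbance penalty
l(x,u,w) = x^{\top} Q x + u^{\top} R u - \gamma^{2} w^{\top} w

% Game value: the controller minimizes, the disturbance maximizes
V^{*}(x_0) = \min_{u}\,\max_{w} \int_{0}^{\infty} l\bigl(x(t),u(t),w(t)\bigr)\,dt

% HJI equation: the Hamiltonian vanishes at the saddle point
0 = \min_{u}\,\max_{w}\Bigl[\, l(x,u,w)
    + \nabla V^{*}(x)^{\top}\bigl(f(x) + g(x)\,u + k(x)\,w\bigr) \Bigr]
```

Read this way, the HJI equation carries three conditions at once: an identity (the bracket equals zero), a minimization over u, and a maximization over w.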

Findings

01 Converges to optimal solutions for linear plants.

02 Exhibits high disturbance resistance in nonlinear plants.

03 Uses gradient descent for policy updates.

Abstract

The uncertainties in plant dynamics remain a challenge for nonlinear control problems. This paper develops a ternary policy iteration (TPI) algorithm for solving nonlinear robust control problems with bounded uncertainties. The controller and uncertainty of the system are considered as game players, and the robust control problem is formulated as a two-player zero-sum differential game. In order to solve the differential game, the corresponding Hamilton-Jacobi-Isaacs (HJI) equation is then derived. Three loss functions and three update phases are designed to match the identity equation, minimization and maximization of the HJI equation, respectively. These loss functions are defined by the expectation of the approximate Hamiltonian in a generated state set to prevent operating all the states in the entire state set concurrently. The parameters of value function and policies are directly…
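
The three-phase structure described in the abstract can be illustrated on a toy problem. The following is a minimal sketch, not the authors' implementation: it assumes a scalar linear plant dx/dt = A*x + B*u + D*w, a quadratic utility Q*x^2 + R*u^2 - GAMMA^2*w^2, and one-parameter approximators V(x) = p*x^2, u = -k*x, w = l*x. All constants, function names, learning rates, and the sampling range are hypothetical choices made for illustration. Under these assumptions the Hamiltonian is H(x) = h*x^2, so each loss gradient has a closed form.

```python
import math

import numpy as np

# Hypothetical scalar plant and utility weights (illustrative assumptions).
A, B, D = -1.0, 1.0, 1.0
Q, R, GAMMA = 1.0, 1.0, 2.0


def h_coeff(p, k, l):
    """Return h such that H(x) = h * x**2 for V = p*x^2, u = -k*x, w = l*x."""
    closed_loop = A - B * k + D * l
    return Q + R * k**2 - GAMMA**2 * l**2 + 2.0 * p * closed_loop


def ternary_policy_iteration(iters=500, batch=256, lr_value=0.5,
                             lr_policy=0.3, seed=0):
    """One gradient step per phase and per iteration on sampled states."""
    rng = np.random.default_rng(seed)
    p = k = l = 0.0
    for _ in range(iters):
        # Generated state set: a fresh batch instead of the whole state space.
        x = rng.uniform(-1.0, 1.0, size=batch)
        m2, m4 = np.mean(x**2), np.mean(x**4)

        # Phase 1 (identity): gradient descent on 0.5 * E[H^2] w.r.t. p.
        closed_loop = A - B * k + D * l
        p -= lr_value * m4 * h_coeff(p, k, l) * 2.0 * closed_loop

        # Phase 2 (minimization): gradient descent on E[H] w.r.t. k.
        k -= lr_policy * m2 * (2.0 * R * k - 2.0 * B * p)

        # Phase 3 (maximization): gradient ascent on E[H] w.r.t. l.
        l += lr_policy * m2 * (2.0 * D * p - 2.0 * GAMMA**2 * l)
    return p, k, l


def riccati_solution():
    """Positive root of the scalar game Riccati equation, used as a check."""
    s = B**2 / R - D**2 / GAMMA**2
    return (A + math.sqrt(A**2 + s * Q)) / s


if __name__ == "__main__":
    p, k, l = ternary_policy_iteration()
    print("learned p, k, l:", p, k, l)
    print("Riccati p*:", riccati_solution())
```

For this linear-quadratic case the learned p can be compared against the positive root of the scalar game Riccati equation, with k and l expected near B*p/R and D*p/GAMMA^2. A nonlinear plant would need richer function approximators and automatic differentiation in place of the hand-derived gradients used here.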


Taxonomy

Topics: Adaptive Dynamic Programming Control · Reinforcement Learning in Robotics · Frequency Control in Power Systems