Regularity and stability of feedback relaxed controls

Christoph Reisinger; Yufei Zhang

arXiv:2001.03148 · math.OC · July 26, 2021 · 5 cites

Open Access

TL;DR

This paper develops a regularized feedback control framework with exploration rewards for stochastic exit time problems, providing stability, robustness, and convergence results that support reinforcement learning heuristics.

Contribution

It introduces a novel relaxed control regularization approach with exploration rewards, establishing stability, robustness, and convergence properties for stochastic control problems.

Findings

01 The regularized control problem admits a Hölder continuous feedback control.

02 Both the value function and the feedback control are Lipschitz stable under parameter perturbations.

03 Convergence of the value functions enables pure exploitation strategies.

Abstract

This paper proposes a relaxed control regularization with general exploration rewards to design robust feedback controls for multi-dimensional continuous-time stochastic exit time problems. We establish that the regularized control problem admits a Hölder continuous feedback control, and demonstrate that both the value function and the feedback control of the regularized control problem are Lipschitz stable with respect to parameter perturbations. Moreover, we show that a pre-computed feedback relaxed control has a robust performance in a perturbed system, and derive a first-order sensitivity equation for both the value function and optimal feedback relaxed control. These stability results provide a theoretical justification for recent reinforcement learning heuristics that including an exploration reward in the optimization objective leads to more robust decision making. We finally…
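To make the role of an exploration reward concrete, the following is a minimal sketch and not the paper's construction. It assumes a finite action set, a cost-minimization convention, and the Shannon entropy as one particular choice of exploration reward. The names `gibbs_policy`, `regularized_min`, and the parameter `eps` are hypothetical. Under these assumptions, minimizing expected cost minus `eps` times the entropy over action distributions gives a Gibbs (softmax) distribution. This distribution varies smoothly with the costs, whereas the unregularized argmin can jump.

```python
import math


def gibbs_policy(costs, eps):
    """Distribution p minimizing sum(p*c) + eps * sum(p*log p) over a finite action set.

    Assumption: Shannon entropy as the exploration reward, costs are minimized.
    The minimizer is proportional to exp(-c / eps).
    """
    m = min(costs)  # shift by the minimum for numerical stability
    weights = [math.exp(-(c - m) / eps) for c in costs]
    total = sum(weights)
    return [w / total for w in weights]


def regularized_min(costs, eps):
    """Optimal value of the entropy-regularized problem: -eps * log(sum(exp(-c/eps)))."""
    m = min(costs)
    return m - eps * math.log(sum(math.exp(-(c - m) / eps) for c in costs))


def argmin_action(costs):
    """Unregularized (pure exploitation) choice: index of the smallest cost."""
    return min(range(len(costs)), key=lambda i: costs[i])


if __name__ == "__main__":
    # Two nearly tied actions: a tiny perturbation flips the argmin,
    # while the Gibbs policy barely moves.
    base = [1.0, 1.001]
    perturbed = [1.001, 1.0]
    print(argmin_action(base), argmin_action(perturbed))
    print(gibbs_policy(base, eps=0.5), gibbs_policy(perturbed, eps=0.5))

    # As eps shrinks, the regularized value increases toward the plain minimum.
    for eps in (1.0, 0.1, 0.01):
        print(eps, regularized_min([1.0, 2.0, 3.0], eps))
```

In this toy setting the regularized value never exceeds the plain minimum and approaches it as `eps` decreases, which mirrors in a very simplified way the vanishing-exploration limit that the findings above associate with pure exploitation strategies.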

Taxonomy

Topics: Reinforcement Learning in Robotics · Risk and Portfolio Optimization · Adaptive Dynamic Programming Control