Regularity and stability of feedback relaxed controls
Christoph Reisinger, Yufei Zhang

TL;DR
This paper develops a regularized feedback control framework with exploration rewards for stochastic exit time problems, providing stability, robustness, and convergence results that support reinforcement learning heuristics.
Contribution
It introduces a novel relaxed control regularization approach with exploration rewards, establishing stability, robustness, and convergence properties for stochastic control problems.
Findings
The regularized control problem admits a Hölder continuous feedback control.
The value function and the feedback control are Lipschitz stable with respect to parameter perturbations.
Convergence of value functions enables pure exploitation strategies.
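The findings above can be illustrated in a much simpler setting than the paper's. The sketch below is a toy, not the authors' construction: it assumes a finite action set, a fixed vector of costs `h` (standing in for pointwise Hamiltonian values), and Shannon entropy as the exploration reward with weight `eps`. The function names `relaxed_control` and `regularized_value` are hypothetical. Under these assumptions, the minimizer of the regularized objective over probability vectors is the Gibbs distribution, and the minimal value is a soft minimum of `h`.

```python
import numpy as np


def relaxed_control(h, eps):
    """Gibbs distribution minimising sum(lam * h) + eps * sum(lam * log(lam)).

    Toy assumption: finite action set with costs h and Shannon entropy
    as the exploration reward; this is not the paper's general setting.
    """
    h = np.asarray(h, dtype=float)
    # Shift by the minimum so every exponent is <= 0 (numerical stability).
    w = np.exp(-(h - h.min()) / eps)
    return w / w.sum()


def regularized_value(h, eps):
    """Minimal value of the entropy-regularized objective (a soft minimum)."""
    h = np.asarray(h, dtype=float)
    m = h.min()
    # Equals -eps * log(sum(exp(-h / eps))), computed in a stable form.
    return m - eps * np.log(np.exp(-(h - m) / eps).sum())
```

In this toy model the soft minimum lies between `min(h) - eps * log(K)` and `min(h)` for `K` actions, so it converges to the unregularized minimum as `eps` tends to zero. It also changes by at most the largest entrywise perturbation of `h`, a finite-dimensional analogue of the Lipschitz stability stated above.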
Abstract
This paper proposes a relaxed control regularization with general exploration rewards to design robust feedback controls for multi-dimensional continuous-time stochastic exit time problems. We establish that the regularized control problem admits a Hölder continuous feedback control, and demonstrate that both the value function and the feedback control of the regularized control problem are Lipschitz stable with respect to parameter perturbations. Moreover, we show that a pre-computed feedback relaxed control has a robust performance in a perturbed system, and derive a first-order sensitivity equation for both the value function and optimal feedback relaxed control. These stability results provide a theoretical justification for recent reinforcement learning heuristics that including an exploration reward in the optimization objective leads to more robust decision making. We finally…
Taxonomy
Topics: Reinforcement Learning in Robotics · Risk and Portfolio Optimization · Adaptive Dynamic Programming Control
