Convergence of machine learning methods for feedback control laws: averaged feedback learning scheme and data driven methods
Karl Kunisch, Donato Vásquez-Varas

TL;DR
This paper compares two machine learning-based methods for synthesizing optimal feedback control laws, analyzing their convergence properties and performance on problems with varying regularity of the value function.
Contribution
It introduces convergence hypotheses for AFLS and data-driven methods, linking their performance to the regularity of the value function, and demonstrates their connection through optimality conditions.
Findings
Both methods perform similarly on smooth value functions.
AFLS outperforms the data-driven method when the value function is non-differentiable.
Numerical experiments validate the theoretical convergence results.
Abstract
This work addresses the synthesis of optimal feedback control laws via machine learning. In particular, the Averaged Feedback Learning Scheme (AFLS) and a data driven method are considered. For each method, hypotheses are provided which ensure that the objective function of the underlying control problem, evaluated at the obtained feedback laws, converges to the optimal value function. These hypotheses are connected to the regularity of the value function and the stability of the dynamics. In the case of AFLS these hypotheses only require Hölder continuity of the value function, whereas for the data driven method the value function must be at least . It is demonstrated that these methods are connected via their optimality conditions. Additionally, numerical experiments are provided by applying both methods to a family of control problems, parameterized by a positive real…
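As a rough illustration of the idea of learning a feedback law by minimizing an averaged closed-loop cost, the sketch below uses a scalar linear-quadratic toy problem. Everything in it is an assumption made for illustration and is not taken from the paper: the dynamics, the cost, the constants `a` and `beta`, the single-gain feedback `u = -k*y` (standing in for a richer parameterization), the averaging over randomly sampled initial conditions, and the grid search used as the optimizer. For this toy problem the optimal gain is known from the scalar Riccati equation, which gives a reference to compare against.

```python
import numpy as np

# Hypothetical scalar test problem (not from the paper):
#   dynamics: y' = a*y + u
#   cost:     J = integral over [0, infinity) of (y^2 + beta*u^2) dt
a, beta = 1.0, 0.5

def cost(y0, k):
    """Closed-form infinite-horizon cost of the feedback u = -k*y (requires k > a)."""
    return y0**2 * (1.0 + beta * k**2) / (2.0 * (k - a))

def averaged_cost(k, initial_conditions):
    """Closed-loop cost averaged over a sample of initial conditions."""
    return float(np.mean(cost(initial_conditions, k)))

# Sample initial conditions from an assumed region of interest [-1, 1].
rng = np.random.default_rng(0)
y0_samples = rng.uniform(-1.0, 1.0, size=50)

# Crude minimization of the averaged cost over stabilizing gains (grid search).
gains = np.linspace(a + 0.01, 10.0, 20000)
values = np.array([averaged_cost(k, y0_samples) for k in gains])
k_learned = gains[np.argmin(values)]

# Reference gain from the scalar algebraic Riccati equation.
k_riccati = a + np.sqrt(a**2 + 1.0 / beta)

print(f"learned gain: {k_learned:.4f}, Riccati gain: {k_riccati:.4f}")
```

In this smooth setting the gain found by minimizing the averaged cost agrees with the Riccati gain up to the grid resolution. The toy problem says nothing about the non-differentiable case discussed in the abstract.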
Taxonomy
Topics: Model Reduction and Neural Networks
