Model-free Value Iteration Algorithm for Continuous-time Stochastic Linear Quadratic Optimal Control Problems
Guangchen Wang, Heng Zhang

TL;DR
This paper introduces a model-free value iteration algorithm for infinite-horizon stochastic linear quadratic control problems. The algorithm does not need a stabilizing control to start, and it is validated on a simulation example.
Contribution
It develops a novel model-free VI algorithm for SLQ problems that does not require initial stabilizing control, with proven convergence.
Findings
Algorithm successfully finds optimal control in simulations
No stabilizing control needed to start the algorithm
Proven convergence of the proposed method
Abstract
This paper presents a novel value iteration (VI) algorithm for finding the optimal control for a kind of infinite-horizon stochastic linear quadratic (SLQ) problem with unknown systems. First, an off-line algorithm is established to obtain the optimal feedback control of our problem. Then, based on the off-line algorithm, the VI-based model-free algorithm and its convergence proof are provided. The main feature of the model-free algorithm is that a stabilizing control is not needed to initiate the algorithm. Finally, we validate our results with a simulation example.
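The abstract does not spell out the algorithm, so the following is only a rough illustration of the general idea of value iteration started from zero rather than from a stabilizing gain. It is a model-based sketch, not the paper's model-free method. It assumes a scalar system dx = (a x + b u) dt + (c x + d u) dW with running cost q x^2 + r u^2, for which the stochastic algebraic Riccati equation reads 2ap + c^2 p + q - (bp + cdp)^2 / (r + d^2 p) = 0. The function names, the fixed step size, and the numerical values are all hypothetical choices made for this example.

```python
import math


def riccati_residual(p, a, b, c, d, q, r):
    """Left-hand side of the scalar stochastic algebraic Riccati equation."""
    s = b * p + c * d * p
    return 2 * a * p + c * c * p + q - s * s / (r + d * d * p)


def value_iteration(a, b, c, d, q, r, step=0.01, iters=5000):
    """Iterate p <- p + step * residual(p), starting from p = 0.

    Starting from zero means no stabilizing feedback gain is supplied.
    The fixed step size is an assumption of this sketch.
    """
    p = 0.0
    for _ in range(iters):
        p += step * riccati_residual(p, a, b, c, d, q, r)
    # Feedback gain for the control law u = -gain * x
    gain = (b * p + c * d * p) / (r + d * d * p)
    return p, gain


# Hypothetical open-loop unstable example (a > 0), with state-dependent noise
A, B, C, D, Q, R = 1.0, 1.0, 0.1, 0.0, 1.0, 1.0
p_star, k_star = value_iteration(A, B, C, D, Q, R)
```

In this example the open-loop drift is unstable (a = 1), yet the iteration starts from p = 0 and the resulting gain satisfies 2(a - b k) + (c - d k)^2 < 0, which is the mean-square stability condition for the scalar closed loop.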
Taxonomy
Topics: Risk and Portfolio Optimization · Adaptive Dynamic Programming Control · Advanced Control Systems Optimization
