Model-free stochastic linear quadratic control for discrete-time systems with multiplicative and additive noises via semidefinite programming
Jing Guo, Xiushan Jiang, Weihai Zhang

TL;DR
This paper introduces a model-free, non-iterative semidefinite programming (SDP) approach to the stochastic LQR problem for discrete-time systems with multiplicative and additive noises, and relates the result to Q-learning theory.
Contribution
The paper develops an SDP-based algorithm that estimates the optimal control gain directly, without an initial stabilizing controller or noise measurements, and analyzes the algorithm's robustness and theoretical foundations.
Findings
The algorithm produces its solution in a single step, with no hyper-parameter tuning
Simulations on an inverter system demonstrate its effectiveness
The analysis offers new insights into Q-learning and reinforcement learning theory
Abstract
This paper investigates a model-free solution to the stochastic linear quadratic regulation (LQR) problem for linear discrete-time systems with both multiplicative and additive noises. We formulate the stochastic LQR problem as a nonconvex optimization problem and rigorously analyze the structure of its dual problem. By exploiting the inherent convexity of the dual problem and analyzing the Karush-Kuhn-Tucker optimality conditions, we establish an explicit relationship between the optimal point of the dual problem and the parameters of the associated Q-function. This theoretical insight, combined with the matrix direct sum technique, makes it possible to develop a novel model-free, sample-efficient, non-iterative semidefinite programming algorithm that directly estimates the optimal control gain without requiring an initial stabilizing controller, or noises…
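To make the link between Q-function parameters and the control gain concrete, the following is a minimal scalar sketch. It is not the paper's SDP algorithm, which is model-free; it is a model-based illustration that assumes known dynamics of the form x_{k+1} = (a + c w_k) x_k + (b + d w_k) u_k + v_k, with w_k zero-mean and unit-variance, and v_k an independent zero-mean additive noise. The system form, the stage cost q x^2 + r u^2, and all function and variable names are assumptions made for illustration. Under these assumptions the quadratic part of the Q-function has coefficients (h_xx, h_xu, h_uu), the gain is K = -h_xu / h_uu, and the additive noise shifts only the constant term of the Q-function, so it does not affect K.

```python
def q_function_params(P, a, b, c, d, q, r):
    """Quadratic coefficients of Q(x, u) = h_xx*x^2 + 2*h_xu*x*u + h_uu*u^2 + const.

    Assumes the value function is P*x^2 + const and the (hypothetical) model
    x_next = (a + c*w)*x + (b + d*w)*u + v with E[w] = 0, E[w^2] = 1.
    """
    h_xx = q + P * (a * a + c * c)
    h_xu = P * (a * b + c * d)
    h_uu = r + P * (b * b + d * d)
    return h_xx, h_xu, h_uu


def solve_scalar_slq(a, b, c, d, q, r, tol=1e-12, max_iter=10000):
    """Fixed-point iteration on the scalar generalized Riccati equation.

    Returns (P, K), where u = K*x is the resulting feedback law.
    """
    P = 0.0
    for _ in range(max_iter):
        h_xx, h_xu, h_uu = q_function_params(P, a, b, c, d, q, r)
        # Minimizing Q over u gives the updated value-function coefficient.
        P_next = h_xx - h_xu * h_xu / h_uu
        converged = abs(P_next - P) < tol
        P = P_next
        if converged:
            break
    h_xx, h_xu, h_uu = q_function_params(P, a, b, c, d, q, r)
    K = -h_xu / h_uu  # gain read off from the Q-function coefficients
    return P, K


def mean_square_rate(a, b, c, d, K):
    """Per-step growth factor of E[x^2] under u = K*x, ignoring additive noise."""
    return (a + b * K) ** 2 + (c + d * K) ** 2
```

For example, `solve_scalar_slq(1.1, 1.0, 0.2, 0.1, 1.0, 1.0)` returns a pair `(P, K)`, and `mean_square_rate(1.1, 1.0, 0.2, 0.1, K)` below 1 indicates that the closed loop is mean-square stable for that illustrative system. In the model-free setting described in the abstract, the Q-function parameters would be estimated from data instead of being computed from a, b, c, and d.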
Taxonomy
Topics: Adaptive Dynamic Programming Control · Stability and Control of Uncertain Systems · Reinforcement Learning in Robotics
