Sample Complexity of Linear Quadratic Regulator Without Initial Stability
Amirreza Neshaei Moghaddam, Alex Olshevsky, Bahman Gharesifard

TL;DR
This paper presents a new receding-horizon algorithm for the LQR problem that does not require initial stability, improves sample complexity, and offers better convergence guarantees through refined analysis.
Contribution
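The paper introduces a receding-horizon LQR algorithm that needs neither two-point gradient estimates nor a stabilizing initial policy, together with a refined analysis of error propagation based on contraction of the Riccati operator under the Riemannian distance.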
Findings
Achieves comparable sample complexity without initial stability
Provides improved convergence guarantees
Refined analysis of Riccati operator contraction
Abstract
Inspired by REINFORCE, we introduce a novel receding-horizon algorithm for the Linear Quadratic Regulator (LQR) problem with unknown dynamics. Unlike prior methods, our algorithm avoids reliance on two-point gradient estimates while maintaining the same order of sample complexity. Furthermore, it eliminates the restrictive requirement of starting with a stable initial policy, broadening its applicability. Beyond these improvements, we introduce a refined analysis of error propagation through the contraction of the Riccati operator under the Riemannian distance. This refinement leads to a better sample complexity and ensures improved convergence guarantees.
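The contraction property mentioned in the abstract can be illustrated numerically. The sketch below is not the paper's algorithm: it assumes the dynamics are known, and it uses made-up system matrices `A`, `B`, `Q`, `R` (with an unstable `A`) chosen only for illustration. It applies the standard discrete-time Riccati operator to two different positive definite matrices and tracks the Riemannian distance between the iterates, taken here as the usual affine-invariant distance, the square root of the sum of squared logarithms of the eigenvalues of `P1^{-1} P2`.

```python
import numpy as np

def riccati_operator(P, A, B, Q, R):
    """One application of the discrete-time Riccati operator:
    Q + A'PA - A'PB (R + B'PB)^{-1} B'PA."""
    BtPA = B.T @ P @ A
    correction = BtPA.T @ np.linalg.solve(R + B.T @ P @ B, BtPA)
    P_next = Q + A.T @ P @ A - correction
    return (P_next + P_next.T) / 2  # remove round-off asymmetry

def riemannian_distance(P1, P2):
    """Affine-invariant distance between positive definite matrices."""
    L = np.linalg.cholesky(P1)        # P1 = L L'
    X = np.linalg.solve(L, P2)        # L^{-1} P2
    M = np.linalg.solve(L, X.T).T     # L^{-1} P2 L^{-T}
    eigs = np.linalg.eigvalsh((M + M.T) / 2)
    return float(np.sqrt(np.sum(np.log(eigs) ** 2)))

# Hypothetical example system: A is unstable (eigenvalues 1.2 and 1.1),
# but the pair (A, B) is controllable.
A = np.array([[1.2, 0.5], [0.0, 1.1]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.eye(1)

# Two arbitrary positive definite starting points.
P1 = np.eye(2)
P2 = np.array([[5.0, 1.0], [1.0, 3.0]])

history = []
for _ in range(10):
    history.append(riemannian_distance(P1, P2))
    P1 = riccati_operator(P1, A, B, Q, R)
    P2 = riccati_operator(P2, A, B, Q, R)

for k, d in enumerate(history):
    print(f"iteration {k}: distance = {d:.6f}")
```

Under these assumptions the printed distances should not increase from one iteration to the next, which is the behavior a contraction argument relies on: an error in one Riccati iterate is damped, rather than amplified, by later applications of the operator.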
Taxonomy
Topics: Adaptive Control of Nonlinear Systems · Cybersecurity and Information Systems · Guidance and Control Systems
Methods: REINFORCE
