Gradient Descent Learns Linear Dynamical Systems
Moritz Hardt, Tengyu Ma, Benjamin Recht

TL;DR
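This paper proves that stochastic gradient descent efficiently converges to the global optimizer of the non-convex maximum likelihood objective for an unknown linear time-invariant dynamical system observed through noisy data. The authors describe the resulting polynomial running time and sample complexity bounds as, to their knowledge, the first for this classical problem.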
Contribution
It establishes polynomial running time and sample complexity bounds for stochastic gradient descent in linear system identification, a problem that has been studied for many decades, under strong but natural assumptions.
Findings
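SGD converges efficiently to the global optimizer of the non-convex maximum likelihood objective
Running time and sample complexity are bounded polynomially, under strong but natural assumptions
To the authors' knowledge, these are the first polynomial guarantees for this problem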
Abstract
We prove that stochastic gradient descent efficiently converges to the global optimizer of the maximum likelihood objective of an unknown linear time-invariant dynamical system from a sequence of noisy observations generated by the system. Even though the objective function is non-convex, we provide polynomial running time and sample complexity bounds under strong but natural assumptions. Linear systems identification has been studied for many decades, yet, to the best of our knowledge, these are the first polynomial guarantees for the problem we consider.
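The setting in the abstract can be illustrated with a minimal sketch. This is a toy, not the paper's algorithm or parameterization. It assumes a scalar-state system h_{t+1} = a*h_t + x_t, y_t = c*h_t + d*x_t + noise, with Gaussian observation noise, so that maximizing likelihood amounts to minimizing squared prediction error. The function names, step size, sequence length, and the clipping of `a` to [-0.9, 0.9] (a crude stand-in for a projection onto stable systems) are all illustrative choices rather than details taken from the paper.

```python
import random


def simulate(a, c, d, xs, noise_std, rng):
    """Run h_{t+1} = a*h_t + x_t, y_t = c*h_t + d*x_t + noise, with h_0 = 0."""
    h, ys = 0.0, []
    for x in xs:
        ys.append(c * h + d * x + rng.gauss(0.0, noise_std))
        h = a * h + x
    return ys


def loss_and_grad(a, c, d, xs, ys):
    """Mean squared prediction error and its gradient w.r.t. (a, c, d)."""
    h = 0.0      # model state h_t
    dh_da = 0.0  # sensitivity of h_t to the parameter a
    loss = ga = gc = gd = 0.0
    for x, y in zip(xs, ys):
        err = c * h + d * x - y
        loss += err * err
        ga += 2.0 * err * c * dh_da
        gc += 2.0 * err * h
        gd += 2.0 * err * x
        # d h_{t+1} / da = h_t + a * d h_t / da (uses h_t before it is updated)
        dh_da = h + a * dh_da
        h = a * h + x
    T = len(xs)
    return loss / T, (ga / T, gc / T, gd / T)


def sgd(true_params, steps=3000, T=50, lr=0.02, noise_std=0.1, seed=0):
    """SGD on fresh noisy trajectories; hyperparameters are illustrative."""
    rng = random.Random(seed)
    a = c = d = 0.0
    for _ in range(steps):
        xs = [rng.gauss(0.0, 1.0) for _ in range(T)]
        ys = simulate(*true_params, xs, noise_std, rng)
        _, (ga, gc, gd) = loss_and_grad(a, c, d, xs, ys)
        # Clip a to keep the model stable (assumed simplification).
        a = max(-0.9, min(0.9, a - lr * ga))
        c -= lr * gc
        d -= lr * gd
    return a, c, d


TRUE_PARAMS = (0.5, 1.0, 0.3)  # hypothetical ground-truth (a, c, d)
a_hat, c_hat, d_hat = sgd(TRUE_PARAMS)
print("estimated (a, c, d):", a_hat, c_hat, d_hat)
```

The objective here is non-convex in (a, c) jointly, since the prediction depends on products such as c*a^k. In this scalar toy the iterates should nonetheless end up close to the true parameters, which is the kind of behaviour the paper analyses in far greater generality.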
Taxonomy
Topics: Neural Networks and Applications · Model Reduction and Neural Networks · Gaussian Processes and Bayesian Inference
