Estimation of Toeplitz Covariance Matrices using Overparameterized Gradient Descent
Daniel Busbib, Ami Wiesel

TL;DR
This paper explores the use of overparameterized gradient descent for Toeplitz covariance matrix estimation, showing that mild overparameterization ensures convergence and can outperform traditional methods.
Contribution
It introduces a simple overparameterized gradient descent approach for Toeplitz covariance estimation, demonstrating theoretical convergence and practical effectiveness.
Findings
Mild overparameterization (K=2P or 4P) consistently enables global convergence from random initializations.
When frequencies are fixed, the landscape is benign and stationary points recover the true covariance.
Overparameterized GD matches or exceeds state-of-the-art accuracy in experiments.
Abstract
We consider covariance estimation under Toeplitz structure. Numerous sophisticated optimization methods have been developed to maximize the Gaussian log-likelihood under Toeplitz constraints. In contrast, recent advances in deep learning demonstrate the surprising power of simple gradient descent (GD) applied to overparameterized models. Motivated by this trend, we revisit Toeplitz covariance estimation through the lens of overparameterized GD. We model the covariance as a sum of K complex sinusoids with learnable parameters and optimize them via GD. We show that when K = P, GD may converge to suboptimal solutions. However, mild overparameterization (K = 2P or 4P) consistently enables global convergence from random initializations. We further propose an accelerated GD variant with separate learning rates for amplitudes and frequencies. When frequencies are fixed and…
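To make the "sum of complex sinusoids" model concrete, here is a minimal sketch in plain Python. It rests on assumptions that the summary does not spell out: P is taken to be the matrix dimension, each sinusoid has a nonnegative amplitude a_k and a frequency w_k, and entry (m, n) of the covariance is the sum over k of a_k * exp(1j * w_k * (m - n)). Any additive noise term is omitted, and the helper names `sinusoid_covariance` and `is_hermitian_toeplitz` are hypothetical. The point is only that every entry depends on the lag m - n, so the matrix is Hermitian Toeplitz for any choice of parameters. That is why GD could plausibly run over amplitudes and frequencies without an explicit Toeplitz constraint.

```python
import cmath
import random


def sinusoid_covariance(amplitudes, frequencies, dim):
    """Build the dim x dim matrix with entries sum_k a_k * exp(1j * w_k * (m - n)).

    Assumed parameterization for illustration; not taken from the paper's text.
    """
    return [
        [
            sum(a * cmath.exp(1j * w * (m - n))
                for a, w in zip(amplitudes, frequencies))
            for n in range(dim)
        ]
        for m in range(dim)
    ]


def is_hermitian_toeplitz(mat, tol=1e-9):
    """Check that entries depend only on the lag m - n and that mat equals its conjugate transpose."""
    dim = len(mat)
    for m in range(dim):
        for n in range(dim):
            # Reference entry with the same lag, taken from the first column or first row.
            ref = mat[m - n][0] if m >= n else mat[0][n - m]
            if abs(mat[m][n] - ref) > tol:
                return False
            if abs(mat[m][n] - mat[n][m].conjugate()) > tol:
                return False
    return True


rng = random.Random(0)
P = 4        # assumed matrix dimension
K = 2 * P    # mildly overparameterized number of sinusoids
amplitudes = [rng.uniform(0.1, 1.0) for _ in range(K)]
frequencies = [rng.uniform(-cmath.pi, cmath.pi) for _ in range(K)]
sigma = sinusoid_covariance(amplitudes, frequencies, P)
```

In this sketch, `is_hermitian_toeplitz(sigma)` holds for any K, and the diagonal of `sigma` equals the sum of the amplitudes. Choosing K larger than P changes only how many sinusoids are available to fit the data, not the structure of the resulting matrix.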
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Gaussian Processes and Bayesian Inference · Sparse and Compressive Sensing Techniques
