Estimation of Toeplitz Covariance Matrices using Overparameterized Gradient Descent

Daniel Busbib; Ami Wiesel

arXiv:2511.01605 · cs.LG · November 4, 2025

Open Access

TL;DR

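This paper revisits Toeplitz covariance matrix estimation with overparameterized gradient descent, showing that mild overparameterization ($K = 2P$ or $4P$ sinusoids for a $P \times P$ covariance) consistently enables global convergence from random initializations and can match or exceed the accuracy of established methods.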

Contribution

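It introduces a simple overparameterized gradient descent approach for Toeplitz covariance estimation, modeling the $P \times P$ covariance as a sum of $K$ complex sinusoids with learnable parameters, and supports it with a landscape analysis for fixed frequencies and with experiments.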

Findings

01. Mild overparameterization ($K = 2P$ or $4P$) consistently enables global convergence from random initializations, whereas $K = P$ may converge to suboptimal solutions.

02. When frequencies are fixed, the landscape is benign and stationary points recover the true covariance.

03. Overparameterized GD matches or exceeds state-of-the-art accuracy in experiments.

Abstract

We consider covariance estimation under Toeplitz structure. Numerous sophisticated optimization methods have been developed to maximize the Gaussian log-likelihood under Toeplitz constraints. In contrast, recent advances in deep learning demonstrate the surprising power of simple gradient descent (GD) applied to overparameterized models. Motivated by this trend, we revisit Toeplitz covariance estimation through the lens of overparameterized GD. We model the $P \times P$ covariance as a sum of $K$ complex sinusoids with learnable parameters and optimize them via GD. We show that when $K = P$, GD may converge to suboptimal solutions. However, mild overparameterization ($K = 2P$ or $4P$) consistently enables global convergence from random initializations. We further propose an accelerated GD variant with separate learning rates for amplitudes and frequencies. When frequencies are fixed and…
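To make the setup in the abstract concrete, here is a minimal NumPy sketch of fitting a sum-of-sinusoids covariance by gradient descent on the Gaussian negative log-likelihood. It is an illustration under stated assumptions, not the authors' implementation: the log-amplitude parameterization, the frequency convention (cycles per sample), the learning-rate values, the step-halving safeguard, and all function names (`steering`, `model_cov`, `nll`, `grads`, `fit`) are choices made here for the example. The only ideas taken from the abstract are the sinusoid model, the likelihood objective, $K \geq P$ components, and separate learning rates for amplitudes and frequencies.

```python
import numpy as np

def steering(freqs, P):
    """P x K matrix; column k is [exp(2j*pi*f_k*p)] for p = 0..P-1."""
    p = np.arange(P)[:, None]
    return np.exp(2j * np.pi * p * freqs[None, :])

def model_cov(log_amp, freqs, P):
    """Hermitian Toeplitz PSD matrix sum_k a_k v_k v_k^H with a_k = exp(log_amp_k).
    The exp parameterization is an assumption used to keep amplitudes positive."""
    V = steering(freqs, P)
    return (V * np.exp(log_amp)) @ V.conj().T

def nll(R, S):
    """Gaussian negative log-likelihood up to constants: log det R + tr(R^{-1} S)."""
    _, logdet = np.linalg.slogdet(R)
    return logdet + np.trace(np.linalg.solve(R, S)).real

def grads(log_amp, freqs, S):
    """Analytic gradients of the NLL w.r.t. log-amplitudes and frequencies."""
    P = S.shape[0]
    V = steering(freqs, P)
    a = np.exp(log_amp)
    Rinv = np.linalg.inv((V * a) @ V.conj().T)
    G = Rinv - Rinv @ S @ Rinv                      # dNLL/dR, Hermitian
    dV = 2j * np.pi * np.arange(P)[:, None] * V     # d v_k / d f_k
    g_amp = a * np.einsum("pk,pq,qk->k", V.conj(), G, V).real
    g_freq = 2 * a * np.einsum("pk,pq,qk->k", dV.conj(), G, V).real
    return g_amp, g_freq

def fit(S, K, lr_amp=0.05, lr_freq=1e-3, iters=500, seed=0):
    """Plain GD with separate step sizes; both are halved whenever a step
    fails to lower the loss (a safeguard added for this sketch)."""
    rng = np.random.default_rng(seed)
    P = S.shape[0]
    # Equal amplitudes chosen so that trace(R) matches trace(S) at initialization.
    log_amp = np.log(np.full(K, np.trace(S).real / (P * K)))
    freqs = rng.uniform(0.0, 1.0, K)                # random initial frequencies
    loss = nll(model_cov(log_amp, freqs, P), S)
    history = [loss]
    for _ in range(iters):
        g_a, g_f = grads(log_amp, freqs, S)
        new_amp, new_f = log_amp - lr_amp * g_a, freqs - lr_freq * g_f
        new_loss = nll(model_cov(new_amp, new_f, P), S)
        if np.isfinite(new_loss) and new_loss < loss:
            log_amp, freqs, loss = new_amp, new_f, new_loss
        else:
            lr_amp, lr_freq = lr_amp / 2, lr_freq / 2
        history.append(loss)
    return log_amp, freqs, history

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    P, N = 8, 200
    # Synthetic ground truth: two sinusoids plus a white-noise floor.
    R_true = model_cov(np.log([2.0, 1.0]), np.array([0.12, 0.37]), P) + 0.5 * np.eye(P)
    Z = (rng.standard_normal((P, N)) + 1j * rng.standard_normal((P, N))) / np.sqrt(2)
    X = np.linalg.cholesky(R_true) @ Z
    S = X @ X.conj().T / N                          # sample covariance
    _, _, hist = fit(S, K=2 * P)                    # mild overparameterization
    print(f"NLL: {hist[0]:.3f} -> {hist[-1]:.3f}")
```

Each entry of the modeled matrix depends only on the index difference, so the Toeplitz structure holds by construction and no explicit constraint is needed during optimization. Changing `K` in the call to `fit` (for example `P`, `2 * P`, `4 * P`) is one way to experiment with the effect of overparameterization described above, though this sketch makes no claim about reproducing the paper's results.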

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Gaussian Processes and Bayesian Inference · Sparse and Compressive Sensing Techniques