Online Covariance Estimation in Averaged SGD: Improved Batch-Mean Rates and Minimax Optimality via Trajectory Regression

Yijin Ni; Xiaoming Huo

arXiv:2604.10814·cs.LG·April 14, 2026

TL;DR

This paper improves online covariance estimation for averaged SGD: a per-block bias analysis sharpens the batch-means rate, and a trajectory regression estimator attains the minimax-optimal rate, all without Hessian access.

Contribution

It introduces a bias-tuned batch-means estimator with an improved convergence rate and a trajectory regression method that attains the minimax-optimal covariance estimation rate in an online setting.

Findings

01

Re-tuning the block-growth parameter improves the batch-means convergence rate to O(n^{-(1-alpha)/3})

02

Trajectory regression achieves the minimax rate matching the lower bound

03

The modified estimator requires no Hessian access and maintains O(d^2) memory
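As a quick arithmetic check on the rates quoted on this page, the sketch below evaluates the three exponents at the limiting learning-rate exponent alpha = 1/2. The helper name `rate_exponents` and its dictionary keys are illustrative assumptions, not names from the paper; the code only restates the formulas (1-alpha)/4, (1-alpha)/3, and (1-alpha)/2.

```python
from fractions import Fraction


def rate_exponents(alpha):
    """Return e for each rate of the form n^{-e} quoted in the abstract.

    `alpha` is the learning-rate exponent; the keys are hypothetical labels.
    """
    one_minus_alpha = 1 - Fraction(alpha)
    return {
        "batch_means_original": one_minus_alpha / 4,  # Zhu, Chen and Wu (2023)
        "batch_means_retuned": one_minus_alpha / 3,   # re-tuned block growth
        "minimax": one_minus_alpha / 2,               # Hessian-free minimax rate
    }


# Evaluate at the limit alpha -> 1/2 (the abstract takes alpha -> 1/2 from above).
exps = rate_exponents(Fraction(1, 2))
```

At alpha = 1/2 this gives exponents 1/8, 1/6, and 1/4. The first two match the O(n^{-1/8}) and O(n^{-1/6}) figures in the abstract; the 1/4 value is simply the minimax formula evaluated at the same limit.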

Abstract

We study online covariance matrix estimation for Polyak-Ruppert averaged stochastic gradient descent (SGD). The online batch-means estimator of Zhu, Chen and Wu (2023) achieves an operator-norm convergence rate of $O(n^{-(1-\alpha)/4})$, which yields $O(n^{-1/8})$ at the optimal learning-rate exponent $\alpha \to 1/2^{+}$. A rigorous per-block bias analysis reveals that re-tuning the block-growth parameter improves the batch-means rate to $O(n^{-(1-\alpha)/3})$, achieving $O(n^{-1/6})$. The modified estimator requires no Hessian access and preserves $O(d^{2})$ memory. We provide a complete error decomposition into variance, stationarity bias, and nonlinearity bias components. A weighted-averaging variant that avoids hard truncation is also discussed. We establish the minimax rate $\Theta(n^{-(1-\alpha)/2})$ for Hessian-free covariance estimation from the SGD trajectory: a Le Cam…
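To make the batch-means idea concrete, here is a minimal sketch of a classical, offline batch-means covariance estimate computed from SGD iterates. It is not the online estimator of Zhu, Chen and Wu (2023), nor the re-tuned estimator of this paper: it assumes equal-size, non-overlapping batches, stores the whole trajectory, and runs on a toy linear regression problem. The function names, the step size `eta0 * t**(-alpha)`, and all default values are assumptions chosen for illustration.

```python
import numpy as np


def averaged_sgd_iterates(n, d, alpha=0.6, eta0=0.2, seed=0):
    """Run SGD on a toy least-squares problem and return all n iterates.

    Assumed setup: x ~ N(0, I_d), y = x @ theta_star + standard normal noise,
    step size eta0 * t**(-alpha).
    """
    rng = np.random.default_rng(seed)
    theta_star = np.ones(d)
    theta = np.zeros(d)
    iterates = np.empty((n, d))
    for t in range(1, n + 1):
        x = rng.standard_normal(d)
        y = x @ theta_star + rng.standard_normal()
        grad = (x @ theta - y) * x  # gradient of 0.5 * (x @ theta - y)**2
        theta = theta - eta0 * t ** (-alpha) * grad
        iterates[t - 1] = theta
    return iterates


def batch_means_covariance(iterates, batch_size):
    """Classical batch-means estimate from equal-size, non-overlapping batches."""
    n, d = iterates.shape
    k = n // batch_size  # number of complete batches; any remainder is dropped
    used = iterates[: k * batch_size]
    batch_means = used.reshape(k, batch_size, d).mean(axis=1)
    centered = batch_means - used.mean(axis=0)
    # Scale the sample covariance of the batch means by the batch size.
    return batch_size * centered.T @ centered / (k - 1)


iterates = averaged_sgd_iterates(n=2000, d=3)
sigma_hat = batch_means_covariance(iterates, batch_size=100)
```

The estimate uses only the iterates, so no Hessian is needed, and the result is a d x d matrix. Unlike the online estimators discussed in the abstract, this sketch keeps the full trajectory in memory and uses a fixed batch size rather than growing blocks.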
