Refining Covariance Matrix Estimation in Stochastic Gradient Descent Through Bias Reduction
Ziyang Wei, Wanrong Zhu, Jingyang Lyu, Wei Biao Wu

TL;DR
This paper introduces a new online covariance estimator for SGD that avoids second-order derivatives and uses bias reduction to converge faster than existing Hessian-free estimators.
Contribution
It proposes a fully online, de-biased covariance estimator for SGD that improves accuracy without requiring Hessian information.
Findings
Achieves a convergence rate of $n^{(\eta-1)/2}\sqrt{\log n}$.
Outperforms existing Hessian-free covariance estimators.
Eliminates the need for second-order derivatives.
Abstract
We study online inference and asymptotic covariance estimation for the stochastic gradient descent (SGD) algorithm. While classical methods (such as plug-in and batch-means estimators) are available, they either require inaccessible second-order (Hessian) information or suffer from slow convergence. To address these challenges, we propose a novel, fully online de-biased covariance estimator that eliminates the need for second-order derivatives while significantly improving estimation accuracy. Our method employs a bias-reduction technique to achieve a convergence rate of $n^{(\eta-1)/2}\sqrt{\log n}$, outperforming existing Hessian-free alternatives.
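For orientation, the kind of Hessian-free baseline the abstract refers to can be sketched as a plain batch-means estimator. The sketch below is not the paper's de-biased estimator: it is a simplified, offline variant with equal-size, non-overlapping batches. The toy objective, the function names `sgd_iterates` and `batch_means_cov`, and all parameter values are assumptions made for illustration. In this toy problem the Hessian and the gradient-noise covariance are both the identity, so the target asymptotic covariance of the averaged iterate is the identity as well.

```python
import random


def sgd_iterates(n, dim=2, eta0=0.5, alpha=0.75, seed=0):
    """Run SGD on the toy loss f(x; xi) = 0.5 * ||x - xi||^2, xi ~ N(0, I).

    Returns the list of all n iterates. The step-size schedule
    eta0 * i**(-alpha) is an illustrative assumption.
    """
    rng = random.Random(seed)
    x = [0.0] * dim
    path = []
    for i in range(1, n + 1):
        step = eta0 * i ** (-alpha)  # polynomially decaying step size
        xi = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        # The stochastic gradient of the toy loss is (x - xi).
        x = [xj - step * (xj - s) for xj, s in zip(x, xi)]
        path.append(x)
    return path


def batch_means_cov(path, num_batches=20):
    """Equal-size, non-overlapping batch-means covariance estimate.

    Uses only the iterates themselves (no Hessian). Any trailing
    iterates that do not fill a whole batch are dropped.
    """
    dim = len(path[0])
    b = len(path) // num_batches  # batch size
    used = path[: b * num_batches]
    grand = [sum(p[j] for p in used) / len(used) for j in range(dim)]
    scale = b / (num_batches - 1)
    cov = [[0.0] * dim for _ in range(dim)]
    for k in range(num_batches):
        batch = used[k * b:(k + 1) * b]
        # Deviation of this batch mean from the overall mean.
        dev = [sum(p[j] for p in batch) / b - grand[j] for j in range(dim)]
        for r in range(dim):
            for c in range(dim):
                cov[r][c] += scale * (dev[r] * dev[c])
    return cov


path = sgd_iterates(20000)
estimate = batch_means_cov(path)
print(estimate)
```

An estimator of this kind needs only the stored iterates, which is what makes it Hessian-free. This version keeps the whole path in memory and is therefore not online, unlike the estimator the abstract describes.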
