Asymptotic and finite-sample properties of estimators based on stochastic gradients
Panos Toulis, Edoardo M. Airoldi

TL;DR
This paper introduces implicit stochastic gradient descent methods that improve numerical stability, and it analyzes the asymptotic and finite-sample properties of stochastic gradient-based estimators, including their efficiency loss, for large-scale parameter estimation.
Contribution
It presents the first full characterization of the asymptotic behavior of both standard and implicit stochastic gradient descent estimators, covering stability benefits and efficiency loss, along with practical algorithms for several model classes.
Findings
Implicit procedures improve stability without extra computational cost.
Explicit variance formulas for the stochastic gradient-based estimators quantify their loss of statistical efficiency.
Algorithms developed for generalized linear models and Cox models show practical effectiveness.
Abstract
Stochastic gradient descent procedures have gained popularity for parameter estimation from large data sets. However, their statistical properties are not well understood, in theory. And in practice, avoiding numerical instability requires careful tuning of key parameters. Here, we introduce implicit stochastic gradient descent procedures, which involve parameter updates that are implicitly defined. Intuitively, implicit updates shrink standard stochastic gradient descent updates. The amount of shrinkage depends on the observed Fisher information matrix, which does not need to be explicitly computed; thus, implicit procedures increase stability without increasing the computational burden. Our theoretical analysis provides the first full characterization of the asymptotic behavior of both standard and implicit stochastic gradient descent-based estimators, including finite-sample error…
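The shrinkage described in the abstract can be illustrated with a minimal sketch, assuming a scalar linear model y = theta * x with squared-error loss (this toy setup, the function names, and the constant learning rate are illustrative assumptions, not the paper's algorithms). In this special case the implicit update theta_new = theta + lr * (y - x * theta_new) * x can be solved in closed form, and it equals the standard update scaled by 1 / (1 + lr * x^2), where x^2 plays the role of the observed Fisher information under unit-variance Gaussian noise.

```python
def explicit_step(theta, x, y, lr):
    """Standard SGD step: the gradient is evaluated at the current iterate."""
    return theta + lr * (y - x * theta) * x


def implicit_step(theta, x, y, lr):
    """Implicit SGD step for the scalar linear model (closed form).

    Solves theta_new = theta + lr * (y - x * theta_new) * x for theta_new.
    The result is the explicit step shrunk by 1 / (1 + lr * x**2).
    """
    shrink = 1.0 / (1.0 + lr * x * x)
    return theta + lr * shrink * (y - x * theta) * x


def run(step, data, lr, theta0=0.0):
    """Apply `step` over a stream of (x, y) pairs.

    A constant learning rate is used here for simplicity (an assumption of
    this sketch); decreasing rates are the usual choice in practice.
    """
    theta = theta0
    for x, y in data:
        theta = step(theta, x, y, lr)
    return theta


if __name__ == "__main__":
    # Noise-free toy stream with true parameter 2.0 and a deliberately
    # oversized learning rate to expose the stability difference.
    data = [(3.0, 6.0)] * 5
    print("explicit:", run(explicit_step, data, lr=1.0))
    print("implicit:", run(implicit_step, data, lr=1.0))
```

With this oversized learning rate the explicit iterate moves away from the true value at every step, while the implicit iterate contracts toward it. For models without a closed-form solution the implicit equation has to be solved numerically at each step, which this sketch does not cover.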
