Tuning Stochastic Gradient Algorithms for Statistical Inference via Large-Sample Asymptotics

Jeffrey Negrea; Jun Yang; Haoyue Feng; Daniel M. Roy; Jonathan H. Huggins

arXiv:2207.12395·stat.CO·July 21, 2023·1 cites

Open Access

TL;DR

This paper develops a theoretical framework for tuning stochastic gradient algorithms using large-sample asymptotics. It shows that iterate averaging with a large fixed step size is robust to the choice of tuning parameters, and it derives guiding principles for practical tuning in statistical inference tasks.

Contribution

The paper introduces a joint step-size--sample-size scaling limit, which provides a theoretical basis for tuning stochastic gradient algorithms (SGAs) and characterizes their asymptotic covariance.

Findings

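01. Iterate averaging with a large fixed step size is robust to the choice of tuning parameters.

02. The asymptotic covariance of the averaged iterates is proportional to that of the MLE sampling distribution.

03. Numerical experiments validate the theoretical results and recommendations in realistic finite-sample regimes.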

Abstract

The tuning of stochastic gradient algorithms (SGAs) for optimization and sampling is often based on heuristics and trial-and-error rather than generalizable theory. We address this theory--practice gap by characterizing the large-sample statistical asymptotics of SGAs via a joint step-size--sample-size scaling limit. We show that iterate averaging with a large fixed step size is robust to the choice of tuning parameters and asymptotically has covariance proportional to that of the MLE sampling distribution. We also prove a Bernstein--von Mises-like theorem to guide tuning, including for generalized posteriors that are robust to model misspecification. Numerical experiments validate our results and recommendations in realistic finite-sample regimes. Our work lays the foundation for a systematic analysis of other stochastic gradient Markov chain Monte Carlo algorithms for a wide range of…
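
To make the idea of iterate averaging with a fixed step size concrete, the following is a minimal sketch and not the paper's algorithm or experimental setup. It assumes a toy Gaussian location model with unit variance, where the MLE is the sample mean, and the function name, step size, batch size, and burn-in length are all hypothetical choices made for illustration. It runs minibatch stochastic gradient descent with a constant step size and averages the iterates after a burn-in period.

```python
import numpy as np


def averaged_sgd_gaussian_mean(x, step_size=0.5, batch_size=10,
                               n_iters=20000, burn_in=1000, seed=0):
    """Constant-step-size minibatch SGD with iterate averaging.

    Toy setting (an assumption for illustration): data x_i ~ N(theta, 1),
    per-observation loss 0.5 * (theta - x_i)**2, so the MLE is x.mean().
    """
    rng = np.random.default_rng(seed)
    theta = 0.0          # arbitrary starting point
    running_sum = 0.0    # sum of post-burn-in iterates
    kept = 0             # number of iterates included in the average
    for k in range(n_iters):
        # Draw a minibatch with replacement from the data set.
        batch = rng.choice(x, size=batch_size, replace=True)
        # Minibatch gradient of the average loss at the current iterate.
        grad = theta - batch.mean()
        # The step size stays fixed; it is never decayed.
        theta -= step_size * grad
        if k >= burn_in:
            running_sum += theta
            kept += 1
    return running_sum / kept


data = np.random.default_rng(1).normal(loc=2.0, scale=1.0, size=500)
theta_bar = averaged_sgd_gaussian_mean(data)
mle = data.mean()
print(f"averaged iterate: {theta_bar:.4f}, MLE (sample mean): {mle:.4f}")
```

In this toy model the individual iterates keep fluctuating around the sample mean because the step size is not decayed, while their average settles close to it. Rerunning with a different `step_size` in (0, 1) is a simple way to see how little the averaged estimate moves.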

Taxonomy

Topics: Markov Chains and Monte Carlo Methods · Stochastic Gradient Optimization Techniques · Gaussian Processes and Bayesian Inference