Statistical Inference for Polyak-Ruppert Averaged Zeroth-order Stochastic Gradient Algorithm

Yanhao Jin; Tesi Xiao; Krishnakumar Balasubramanian

arXiv:2102.05198 · stat.ML · November 16, 2021 · 1 citation

Open Access

TL;DR

This paper develops a statistical inference framework for zeroth-order stochastic gradient algorithms, enabling uncertainty quantification through confidence intervals, which was previously lacking in derivative-free optimization methods.

Contribution

It establishes a central limit theorem for Polyak-Ruppert averaged zeroth-order algorithms and proposes online estimators for asymptotic covariance, facilitating practical confidence set construction.
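To make the object of the central limit theorem concrete, here is a minimal sketch of a zeroth-order stochastic gradient iteration with Polyak-Ruppert averaging. It is an illustration under stated assumptions, not the paper's exact algorithm: the two-point Gaussian-smoothing gradient estimator, the step-size schedule eta0 * t^(-alpha), the smoothing parameter mu, and the toy objective noisy_loss are all choices made here for illustration.

```python
import random


def noisy_loss(x, xi):
    """Hypothetical stochastic objective F(x, xi) = 0.5 * ||x - xi||^2."""
    return 0.5 * sum((a - b) ** 2 for a, b in zip(x, xi))


def zo_gradient(loss, x, xi, mu, rng):
    """Two-point Gaussian-smoothing gradient estimate (assumed estimator).

    Uses only two function evaluations at the same sample xi; no derivatives.
    """
    u = [rng.gauss(0.0, 1.0) for _ in x]          # random search direction
    x_plus = [a + mu * b for a, b in zip(x, u)]   # perturbed point
    scale = (loss(x_plus, xi) - loss(x, xi)) / mu  # finite difference along u
    return [scale * b for b in u]


def averaged_zo_sgd(loss, sample, x0, n_steps,
                    eta0=0.2, alpha=0.75, mu=1e-4, seed=0):
    """Run zeroth-order SGD and return (last iterate, Polyak-Ruppert average)."""
    rng = random.Random(seed)
    x = list(x0)
    x_bar = [0.0] * len(x)
    for t in range(1, n_steps + 1):
        xi = sample(rng)                           # draw one data point
        g = zo_gradient(loss, x, xi, mu, rng)
        eta = eta0 * t ** (-alpha)                 # decaying step size (assumed)
        x = [a - eta * b for a, b in zip(x, g)]
        # Running mean of the iterates x_1, ..., x_t.
        x_bar = [m + (a - m) / t for m, a in zip(x_bar, x)]
    return x, x_bar


if __name__ == "__main__":
    target = [1.0, -2.0]  # minimizer of the expected toy loss
    draw = lambda rng: [c + rng.gauss(0.0, 1.0) for c in target]
    last, avg = averaged_zo_sgd(noisy_loss, draw, [0.0, 0.0], 20000)
    print("last iterate:", last)
    print("averaged iterate:", avg)
```

The averaged iterate, not the last iterate, is the quantity the central limit theorem concerns; the running-mean update keeps the sketch online, with memory that does not grow with the number of steps.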

Findings

01. Proves a central limit theorem for the averaged zeroth-order stochastic gradient algorithm.

02. Provides online estimators for the asymptotic covariance matrix.

03. Enables construction of valid confidence intervals in zeroth-order optimization.
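As a rough illustration of how a central limit theorem and a covariance estimate combine into a confidence interval, the sketch below builds a standard Wald-type interval for one coordinate of the averaged iterate. It assumes the usual normalization, sqrt(n) times the estimation error being asymptotically normal, and takes the estimated variance sigma_jj as an input. The paper's online covariance estimators are not reproduced here; the function name and parameters are hypothetical.

```python
import math
from statistics import NormalDist


def wald_interval(x_bar_j, sigma_jj, n, level=0.95):
    """Asymptotic confidence interval for one coordinate.

    x_bar_j  : j-th coordinate of the averaged iterate
    sigma_jj : estimated j-th diagonal entry of the asymptotic covariance
               (assumed to come from some consistent estimator)
    n        : number of iterations averaged
    level    : nominal coverage, e.g. 0.95
    """
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # two-sided normal quantile
    half_width = z * math.sqrt(sigma_jj / n)
    return x_bar_j - half_width, x_bar_j + half_width


if __name__ == "__main__":
    lo, hi = wald_interval(x_bar_j=1.0, sigma_jj=4.0, n=10000)
    print(f"95% interval: [{lo:.4f}, {hi:.4f}]")
```

The interval's coverage is only asymptotic, and it is only as good as the covariance estimate plugged into it, which is why a consistent online estimator matters in practice.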

Abstract

Statistical machine learning models trained with stochastic gradient algorithms are increasingly being deployed in critical scientific applications. However, computing the stochastic gradient in several such applications is highly expensive or, at times, impossible. In such cases, derivative-free or zeroth-order algorithms are used. An important question that has not yet been adequately addressed in the statistical machine learning literature is how to equip stochastic zeroth-order algorithms with practical yet rigorous inferential capabilities, so that we obtain not only point estimates or predictions but also a quantification of the associated uncertainty via confidence intervals or sets. Towards this, in this work, we first establish a central limit theorem for the Polyak-Ruppert averaged stochastic zeroth-order gradient algorithm. We then provide online estimators of the asymptotic…

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Gaussian Processes and Bayesian Inference · Sparse and Compressive Sensing Techniques