HiGrad: Uncertainty Quantification for Online Learning and Stochastic Approximation
Weijie J. Su, Yuancheng Zhu

TL;DR
This paper introduces HiGrad, a novel hierarchical inference procedure for online learning with stochastic gradient descent, providing statistically valid confidence intervals without extra computational cost.
Contribution
HiGrad is a hierarchical method for statistical inference in SGD-based online learning. It builds confidence intervals by decorrelating the predictions from multiple SGD threads using the covariance structure among them.
Findings
HiGrad confidence intervals achieve asymptotically exact coverage probability.
The method performs well in simulations and real data applications.
An R package, 'higrad', implements the procedure.
Abstract
Stochastic gradient descent (SGD) is an immensely popular approach for online learning in settings where data arrives in a stream or data sizes are very large. However, despite an ever-increasing volume of work on SGD, much less is known about the statistical inferential properties of SGD-based predictions. Taking a fully inferential viewpoint, this paper introduces a novel procedure termed HiGrad to conduct statistical inference for online learning, without incurring additional computational cost compared with SGD. The HiGrad procedure begins by performing SGD updates for a while and then splits the single thread into several threads, and this procedure hierarchically operates in this fashion along each thread. With predictions provided by multiple threads in place, a t-based confidence interval is constructed by decorrelating predictions using covariance structures given by a…
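The split-and-thread idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and not the 'higrad' R package: it uses a single split (one shared segment followed by several threads), a one-dimensional mean-estimation problem where the prediction is the parameter itself, and hypothetical names (`sgd_segment`, `higrad_two_level`, `t_crit`). It also assumes, for illustration only, that the averaged iterates of different segments are approximately independent with variance inversely proportional to segment length; the weights and step sizes are arbitrary choices.

```python
import math
import random
from itertools import islice


def sgd_segment(theta, ys, t_start, step0=0.5, decay=0.55):
    """Run SGD on the loss 0.5 * (theta - y)**2 over the observations ys.

    Returns the last iterate and the average of the iterates in this segment.
    """
    total = 0.0
    for j, y in enumerate(ys):
        step = step0 / (t_start + j + 1) ** decay  # decaying step size
        theta -= step * (theta - y)                # stochastic gradient step
        total += theta
    return theta, total / len(ys)


def higrad_two_level(stream, n0, n1, n_threads, t_crit, theta0=0.0):
    """One shared SGD segment of length n0, then n_threads threads of length n1.

    t_crit is the t quantile with n_threads - 1 degrees of freedom, supplied
    by the caller. Uses n0 + n_threads * n1 observations, one gradient each.
    """
    it = iter(stream)
    # Shared segment: every thread inherits its last iterate and its average.
    theta, avg0 = sgd_segment(theta0, list(islice(it, n0)), 0)
    # Assumed weights: each thread estimate averages all iterates on its path.
    w0, w1 = n0 / (n0 + n1), n1 / (n0 + n1)
    preds = []
    for _ in range(n_threads):
        # Each thread continues from the shared iterate on fresh data.
        _, avg1 = sgd_segment(theta, list(islice(it, n1)), n0)
        preds.append(w0 * avg0 + w1 * avg1)
    # Assumed covariance of thread estimates, up to an unknown scale V:
    # Cov(pred_i, pred_j) = V * (b + a * [i == j]).
    a = w1 ** 2 / n1   # part unique to each thread
    b = w0 ** 2 / n0   # part shared by all threads
    mean = sum(preds) / n_threads
    rss = sum((p - mean) ** 2 for p in preds)
    v_hat = rss / (a * (n_threads - 1))        # the shared part cancels in rss
    se = math.sqrt((a / n_threads + b) * v_hat)  # includes the shared part
    return mean, (mean - t_crit * se, mean + t_crit * se)


if __name__ == "__main__":
    rng = random.Random(0)
    stream = (rng.gauss(2.0, 1.0) for _ in range(10_000))
    # 3.182 is the 97.5% quantile of the t distribution with 3 degrees of freedom.
    est, (lo, hi) = higrad_two_level(stream, n0=2000, n1=2000,
                                     n_threads=4, t_crit=3.182)
    print(f"estimate {est:.3f}, 95% interval ({lo:.3f}, {hi:.3f})")
```

The thread estimates are correlated because they share the first segment, so a naive t-interval on `preds` would be too narrow. In this sketch the spread between threads estimates the noise scale, and the standard error then adds back the shared component `b`. The whole run consumes each observation once, which matches the abstract's claim of no extra cost relative to a single SGD pass.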
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Gaussian Processes and Bayesian Inference · Privacy-Preserving Technologies in Data
Methods: Stochastic Gradient Descent
