Asymptotics of Stochastic Gradient Descent with Dropout Regularization in Linear Models
Jiaqi Li, Johannes Schmidt-Hieber, Wei Biao Wu

TL;DR
This paper develops an asymptotic theory for dropout-regularized stochastic gradient descent in linear models, providing central limit theorems (CLTs) and online inference methods with practical confidence intervals.
Contribution
It introduces a novel asymptotic framework for dropout SGD, establishing stationary distributions, CLTs, and efficient online covariance estimation methods.
Findings
CLTs for dropout SGD iterates and their averages
Effective online covariance estimation for inference
Numerical results show confidence intervals achieve nominal coverage
Abstract
This paper proposes an asymptotic theory for online inference of the stochastic gradient descent (SGD) iterates with dropout regularization in linear regression. Specifically, we establish the geometric-moment contraction (GMC) for constant step-size SGD dropout iterates to show the existence of a unique stationary distribution of the dropout recursive function. By the GMC property, we provide quenched central limit theorems (CLT) for the difference between dropout and ℓ²-regularized iterates, regardless of initialization. The CLT for the difference between the Ruppert-Polyak averaged SGD (ASGD) with dropout and ℓ²-regularized iterates is also presented. Based on these asymptotic normality results, we further introduce an online estimator for the long-run covariance matrix of ASGD dropout to facilitate inference in a recursive manner with efficiency in computational time and…
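To make the objects in the abstract concrete, here is a minimal sketch of constant step-size SGD with dropout in linear regression, together with Ruppert-Polyak averaging of the iterates. It is an illustration under stated assumptions, not the authors' implementation: the function names `dropout_sgd` and `l2_regularized_target`, the defaults `step=0.05` and `keep_prob=0.8`, and the zero initialization are all hypothetical choices. The update is the usual gradient step on the squared loss with an independent Bernoulli mask applied to the covariates at each step. The comparison target is the closed-form minimizer of the dropout objective after averaging over the masks, which takes the form of a weighted ℓ² (ridge-type) penalty on the diagonal of the Gram matrix.

```python
import numpy as np


def dropout_sgd(X, y, step=0.05, keep_prob=0.8, seed=0):
    """Constant step-size SGD with dropout for linear regression (sketch).

    Assumed update, one observation (x_k, y_k) per step:
        beta_k = beta_{k-1} + step * D_k x_k * (y_k - x_k^T D_k beta_{k-1}),
    where D_k is a diagonal mask with i.i.d. Bernoulli(keep_prob) entries.
    Returns the last iterate and the running (Ruppert-Polyak) average.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    beta = np.zeros(d)      # hypothetical initialization at zero
    beta_bar = np.zeros(d)  # running average of the iterates
    for k in range(n):
        mask = rng.random(d) < keep_prob    # fresh dropout mask D_k
        x_masked = X[k] * mask              # D_k x_k
        residual = y[k] - x_masked @ beta   # y_k - x_k^T D_k beta_{k-1}
        beta = beta + step * residual * x_masked
        beta_bar += (beta - beta_bar) / (k + 1)  # online mean update
    return beta, beta_bar


def l2_regularized_target(X, y, keep_prob=0.8):
    """Minimizer of the dropout least-squares loss averaged over the masks.

    Solves (p * G + (1 - p) * Diag(G)) beta = X^T y with G = X^T X,
    which is least squares with a diagonal, ridge-type penalty.
    """
    gram = X.T @ X
    penalized = keep_prob * gram + (1.0 - keep_prob) * np.diag(np.diag(gram))
    return np.linalg.solve(penalized, X.T @ y)


if __name__ == "__main__":
    # Synthetic data; all values here are illustrative assumptions.
    rng = np.random.default_rng(1)
    X = rng.standard_normal((20000, 3))
    y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.standard_normal(20000)
    last, avg = dropout_sgd(X, y)
    print("last iterate    :", last)
    print("averaged iterate:", avg)
    print("l2 target       :", l2_regularized_target(X, y))
```

On data like this, the last iterate keeps fluctuating around the target because the step size is constant and the masks inject noise at every step, while the averaged iterate settles much closer to it. That gap is why the averaged iterate is the natural basis for confidence intervals.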
Taxonomy
Topics: Point processes and geometric inequalities · Stochastic processes and financial applications
Methods: Dropout · Stochastic Gradient Descent
