An Asymptotic Analysis of Minibatch-Based Momentum Methods for Linear Regression Models
Yuan Gao, Xuening Zhu, Haobo Qi, Guodong Li, Riquan Zhang, Hansheng Wang

TL;DR
This paper provides a theoretical analysis of minibatch-based momentum methods for linear regression, establishing convergence rates, optimal tuning parameters, and conditions for statistical efficiency, supported by numerical experiments.
Contribution
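It offers the first comprehensive theoretical analysis of minibatch-based gradient descent methods with momentum (MGDM) for linear regression, covering numerical convergence, optimal tuning parameters, and the conditions under which the resulting estimator is statistically efficient.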
Findings
Derived the convergence properties of MGDM algorithms.
Identified optimal tuning parameters for faster convergence.
Established conditions for statistical efficiency of MGDM estimators.
Abstract
Momentum methods have been shown to accelerate the convergence of the standard gradient descent algorithm in practice and theory. In particular, the minibatch-based gradient descent methods with momentum (MGDM) are widely used to solve large-scale optimization problems with massive datasets. Despite the success of the MGDM methods in practice, their theoretical properties are still underexplored. To this end, we investigate the theoretical properties of MGDM methods based on linear regression models. We first study the numerical convergence properties of the MGDM algorithm and further provide the theoretically optimal specification of the tuning parameters to achieve a faster convergence rate. In addition, we explore the relationship between the statistical properties of the resulting MGDM estimator and the tuning parameters. Based on these theoretical findings, we give the conditions for…
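To make the setting concrete, the following is a minimal sketch of minibatch gradient descent with momentum applied to a least-squares linear regression problem. It uses the standard heavy-ball update, which may differ in detail from the exact scheme analyzed in the paper. The function name `mgdm_linear_regression`, its parameters, and the default values for the learning rate and momentum are assumptions chosen for illustration, not the optimal tuning parameters the paper derives.

```python
import numpy as np


def mgdm_linear_regression(X, y, batch_size=100, lr=0.1, momentum=0.5,
                           epochs=100, seed=0):
    """Illustrative minibatch gradient descent with heavy-ball momentum.

    Minimizes the least-squares loss (1 / 2n) * ||y - X @ theta||^2.
    All defaults are assumptions for this sketch, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    theta = np.zeros(p)      # current parameter estimate
    velocity = np.zeros(p)   # momentum term carried across updates

    for _ in range(epochs):
        # Shuffle once per epoch, then sweep through disjoint minibatches.
        perm = rng.permutation(n)
        for start in range(0, n, batch_size):
            idx = perm[start:start + batch_size]
            Xb, yb = X[idx], y[idx]
            # Gradient of the least-squares loss on this minibatch only.
            grad = Xb.T @ (Xb @ theta - yb) / len(idx)
            # Heavy-ball update: reuse a fraction of the previous step.
            velocity = momentum * velocity - lr * grad
            theta = theta + velocity
    return theta


if __name__ == "__main__":
    # Synthetic data, generated here purely for demonstration.
    rng = np.random.default_rng(1)
    X = rng.standard_normal((1000, 3))
    theta_true = np.array([1.0, -2.0, 0.5])
    y = X @ theta_true + 0.1 * rng.standard_normal(1000)

    theta_mgdm = mgdm_linear_regression(X, y)
    theta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
    print("MGDM estimate:", theta_mgdm)
    print("OLS estimate: ", theta_ols)
```

Comparing the output with the ordinary least-squares solution gives a rough sense of the question the abstract raises: how the learning rate, momentum, and batch size affect how close the MGDM estimator ends up to the statistically efficient estimator.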
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Advanced Image Processing Techniques
