Weighted SGD for $\ell_p$ Regression with Randomized Preconditioning
Jiyan Yang, Yin-Lam Chow, Christopher Ré, Michael W. Mahoney

TL;DR
This paper introduces pwSGD, a hybrid algorithm that combines randomized linear algebra and stochastic gradient descent to solve large-scale $\ell_p$ regression problems efficiently, with improved convergence rates and lower computational complexity.
Contribution
The paper proposes pwSGD, a novel hybrid method that integrates randomized linear algebra (RLA) preconditioning with weighted SGD, achieving faster convergence on $\ell_p$ regression problems than existing approaches.
Findings
pwSGD achieves $O(\text{nnz}(A) \log n + \text{poly}(d)/\epsilon^2)$ time for $\ell_1$ regression.
For $\ell_2$ regression, pwSGD attains $O(\text{nnz}(A) \log n + \text{poly}(d) \log(1/\epsilon)/\epsilon)$ complexity.
Numerical experiments demonstrate the effectiveness of pwSGD on synthetic and real datasets.
Abstract
In recent years, stochastic gradient descent (SGD) methods and randomized linear algebra (RLA) algorithms have been applied to many large-scale problems in machine learning and data analysis. We aim to bridge the gap between these two methods in solving constrained overdetermined linear regression problems, e.g., $\ell_2$ and $\ell_1$ regression problems. We propose a hybrid algorithm named pwSGD that uses RLA techniques for preconditioning and constructing an importance sampling distribution, and then performs an SGD-like iterative process with weighted sampling on the preconditioned system. We prove that pwSGD inherits faster convergence rates that only depend on the lower dimension of the linear system, while maintaining low computational complexity. Particularly, when solving $\ell_1$ regression with size $n$ by $d$, pwSGD returns an approximate solution with relative…
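The pipeline described in the abstract (RLA-based preconditioning, an importance sampling distribution, then weighted SGD on the preconditioned system) can be illustrated with a minimal Python sketch for unconstrained $\ell_2$ regression. Everything specific here is an assumption made for illustration, not the authors' implementation: the function name `pwsgd_l2`, the dense Gaussian sketch, the sketch size of `4 * d` rows, forming the preconditioned matrix explicitly, and the constant step size, which turns each update into a randomized Kaczmarz projection.

```python
import numpy as np


def pwsgd_l2(A, b, n_iters=2000, sketch_rows=None, seed=0):
    """Illustrative pwSGD-style solver for min_x ||Ax - b||_2 (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    s = sketch_rows or 4 * d  # assumed sketch size, not taken from the paper

    # Step 1 (preconditioning): sketch A with a dense Gaussian matrix
    # (an assumption; other sketches are possible) and take R from QR(SA).
    S = rng.standard_normal((s, n)) / np.sqrt(s)
    _, R = np.linalg.qr(S @ A)

    # Step 2 (importance sampling): form U = A R^{-1} explicitly (a
    # simplification) and sample rows proportionally to their squared norms.
    U = np.linalg.solve(R.T, A.T).T
    row_norms_sq = np.einsum("ij,ij->i", U, U)
    probs = row_norms_sq / row_norms_sq.sum()

    # Step 3 (weighted SGD): minimize ||Uy - b||_2^2 over y. The importance-
    # weighted gradient for row i is 2 (U_i y - b_i) U_i / p_i; with the
    # constant step 1 / (2 ||U||_F^2) the update simplifies to the line below.
    y = np.zeros(d)
    for i in rng.choice(n, size=n_iters, p=probs):
        residual = U[i] @ y - b[i]
        y -= (residual / row_norms_sq[i]) * U[i]

    # Map back to the original variables: x = R^{-1} y.
    return np.linalg.solve(R, y)
```

Calling `pwsgd_l2(A, b)` on a tall matrix `A` returns an approximate solution `x`. With noisy data, a constant step of this kind only reaches a neighborhood of the least-squares solution; a decaying step size or iterate averaging is the usual remedy.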
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Markov Chains and Monte Carlo Methods
