Weighted SGD for $\ell_p$ Regression with Randomized Preconditioning

Jiyan Yang; Yin-Lam Chow; Christopher Ré; Michael W. Mahoney

arXiv:1502.03571·math.OC·July 11, 2017·J. Mach. Learn. Res.·2 cites

Open Access

TL;DR

This paper introduces pwSGD, a hybrid algorithm combining randomized linear algebra and stochastic gradient descent to efficiently solve large-scale $\ell_p$ regression problems with improved convergence rates and lower computational complexity.

Contribution

The paper proposes pwSGD, a novel hybrid method that integrates RLA preconditioning with weighted SGD, achieving faster convergence for $\ell_p$ regression problems compared to existing approaches.

Findings

01

pwSGD achieves $O(\text{nnz}(A) \log n + \text{poly}(d)/\epsilon^2)$ time for $\ell_1$ regression.

02

For $\ell_2$ regression, pwSGD attains $O(\text{nnz}(A) \log n + \text{poly}(d) \log(1/\epsilon)/\epsilon)$ complexity.

03

Numerical experiments demonstrate the effectiveness of pwSGD on synthetic and real datasets.

Abstract

In recent years, stochastic gradient descent (SGD) methods and randomized linear algebra (RLA) algorithms have been applied to many large-scale problems in machine learning and data analysis. We aim to bridge the gap between these two methods in solving constrained overdetermined linear regression problems, e.g., $\ell_2$ and $\ell_1$ regression problems. We propose a hybrid algorithm named pwSGD that uses RLA techniques for preconditioning and constructing an importance sampling distribution, and then performs an SGD-like iterative process with weighted sampling on the preconditioned system. We prove that pwSGD inherits faster convergence rates that only depend on the lower dimension of the linear system, while maintaining low computation complexity. Particularly, when solving $\ell_1$ regression with size $n$ by $d$, pwSGD returns an approximate solution with $\epsilon$ relative…
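The three stages the abstract describes (RLA preconditioning, building an importance sampling distribution, and weighted SGD on the preconditioned system) can be illustrated with a minimal NumPy sketch for the unconstrained $\ell_2$ case. This is not the authors' implementation. The function name `pwsgd_l2` and its parameters are hypothetical. The dense Gaussian sketch, the exact computation of $U = AR^{-1}$, and returning the last iterate are simplifying assumptions made here for brevity.

```python
import numpy as np


def pwsgd_l2(A, b, sketch_size=200, n_iters=5000, step=1.0, seed=0):
    """Illustrative preconditioned weighted SGD for min_x ||Ax - b||_2.

    Hypothetical helper, not the paper's reference code.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape

    # Stage 1 (RLA preconditioning): sketch A, then take R from a QR
    # factorization of the sketch. A dense Gaussian sketch is an
    # assumption chosen for simplicity.
    S = rng.standard_normal((sketch_size, n)) / np.sqrt(sketch_size)
    R = np.linalg.qr(S @ A, mode="r")

    # U = A R^{-1} is the preconditioned (well-conditioned) matrix.
    # It is formed exactly here, which costs O(n d^2).
    U = np.linalg.solve(R.T, A.T).T

    # Stage 2 (importance sampling): probabilities proportional to the
    # squared row norms of U, i.e. approximate leverage scores of A.
    row_norms_sq = np.einsum("ij,ij->i", U, U)
    probs = row_norms_sq / row_norms_sq.sum()

    # Stage 3 (weighted SGD) in the variable y = R x, for the objective
    # sum_i (u_i^T y - b_i)^2. The unbiased gradient estimate is
    # 2 * r_i * u_i / p_i; with learning rate step / (2 * ||U||_F^2)
    # the update reduces to the line inside the loop.
    y = np.zeros(d)
    for i in rng.choice(n, size=n_iters, p=probs):
        residual = U[i] @ y - b[i]
        y -= step * residual * U[i] / row_norms_sq[i]

    # Map back to the original variable: x = R^{-1} y.
    return np.linalg.solve(R, y)
```

On a consistent system (`b = A @ x_true`) the default `step=1.0` should drive the residual close to zero. With noisy `b`, a constant step only reaches a neighborhood of the least-squares solution, so a smaller `step` or iterate averaging would likely be needed.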

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Markov Chains and Monte Carlo Methods