SAPPHIRE: Preconditioned Stochastic Variance Reduction for Faster Large-Scale Statistical Learning

Jingruo Sun; Zachary Frangella; Madeleine Udell

arXiv:2501.15941·stat.ML·January 28, 2025

TL;DR

SAPPHIRE is a novel stochastic variance reduction algorithm that uses sketch-based preconditioning to achieve faster convergence in large-scale, ill-conditioned convex learning problems, outperforming existing methods.

Contribution

The paper introduces SAPPHIRE, a new preconditioned variance-reduced stochastic gradient method that handles ill-conditioning and non-smooth regularizers efficiently.
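
For orientation, the problem class described here can be written in the standard composite form, together with the usual definition of a proximal mapping scaled by a preconditioner. The notation ($\ell_i$, $r$, $P$, $\eta$) is assumed for illustration and may differ from the paper's.

```latex
\min_{w \in \mathbb{R}^d} \; F(w) = \frac{1}{n}\sum_{i=1}^{n} \ell_i(w) + r(w),
\qquad
\operatorname{prox}^{P}_{\eta r}(z)
  = \operatorname*{arg\,min}_{u \in \mathbb{R}^d}
    \Big\{ r(u) + \frac{1}{2\eta}\,\lVert u - z \rVert_P^2 \Big\},
\quad
\lVert v \rVert_P^2 = v^\top P v .
```

Here the $\ell_i$ are smooth per-sample losses, $r$ is the possibly non-smooth regularizer, and $P$ is a positive definite preconditioner; with $P = I$ the scaled mapping reduces to the ordinary proximal operator.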

Findings

01. SAPPHIRE converges up to 20 times faster than existing methods.

02. It achieves linear convergence without dependence on the condition number.

03. The method is robust even with infrequent preconditioner updates.

Abstract

Regularized empirical risk minimization (rERM) has become important in data-intensive fields such as genomics and advertising, with stochastic gradient methods typically used to solve the largest problems. However, ill-conditioned objectives and non-smooth regularizers undermine the performance of traditional stochastic gradient methods, leading to slow convergence and significant computational costs. To address these challenges, we propose the SAPPHIRE (Sketching-based Approximations for Proximal Preconditioning and Hessian Inexactness with Variance-REduced Gradients) algorithm, which integrates sketch-based preconditioning to tackle ill-conditioning and uses a scaled proximal mapping to minimize the non-smooth regularizer. This stochastic variance-reduced algorithm achieves condition-number-free…
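
The abstract names three ingredients: a preconditioner, a scaled proximal mapping, and a variance-reduced gradient. The following is a minimal sketch of how these can fit together on a lasso problem, not the authors' implementation. As a simplifying assumption, it swaps the paper's sketch-based preconditioner for a damped diagonal of the least-squares Hessian, which makes the scaled proximal step a closed-form soft-threshold, and it uses a plain SVRG-style gradient estimate. All function names, parameters, and default values are hypothetical.

```python
import numpy as np

def scaled_soft_threshold(z, thresh):
    """Elementwise soft-thresholding. With thresh = eta * lam / p this is the
    prox of lam * ||.||_1 in the norm induced by a diagonal preconditioner p."""
    return np.sign(z) * np.maximum(np.abs(z) - thresh, 0.0)

def lasso_objective(A, b, lam, x):
    """0.5/n * ||Ax - b||^2 + lam * ||x||_1."""
    r = A @ x - b
    return 0.5 * (r @ r) / len(b) + lam * np.abs(x).sum()

def precond_prox_svrg(A, b, lam, eta=0.1, epochs=30, batch=20, rho=1e-3, seed=0):
    """Illustrative preconditioned proximal SVRG loop for the lasso."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    # ASSUMPTION: diagonal of A^T A / n plus damping rho stands in for the
    # sketch-based preconditioner described in the abstract.
    p = (A ** 2).mean(axis=0) + rho
    x = np.zeros(d)
    for _ in range(epochs):
        snap = x.copy()                           # snapshot point
        full_grad = A.T @ (A @ snap - b) / n      # full gradient at snapshot
        for _ in range(n // batch):
            idx = rng.choice(n, size=batch, replace=False)
            Ai = A[idx]
            # SVRG estimate: grad_B(x) - grad_B(snap) + full_grad
            g = Ai.T @ (Ai @ (x - snap)) / batch + full_grad
            # Preconditioned gradient step followed by the scaled prox
            x = scaled_soft_threshold(x - eta * g / p, eta * lam / p)
    return x

# Small ill-conditioned demo: column scales span two orders of magnitude.
rng = np.random.default_rng(1)
n, d = 200, 10
scales = np.logspace(0, 2, d)
A = rng.standard_normal((n, d)) * scales
b = A @ np.ones(d) + 0.1 * rng.standard_normal(n)
lam = 0.01

x_hat = precond_prox_svrg(A, b, lam)
print("objective at zero: ", lasso_objective(A, b, lam, np.zeros(d)))
print("objective at x_hat:", lasso_objective(A, b, lam, x_hat))
```

The diagonal choice only corrects for differences in column scale; a sketch-based preconditioner as described in the abstract is meant to capture curvature that a diagonal cannot, at the cost of a proximal step that generally has no closed form.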

Taxonomy

Topics: Speech Recognition and Synthesis · Generative Adversarial Networks and Image Synthesis · Gaussian Processes and Bayesian Inference

Methods: Logistic Regression