PROMISE: Preconditioned Stochastic Optimization Methods by Incorporating Scalable Curvature Estimates

Zachary Frangella, Pratik Rathore, Shipu Zhao, and Madeleine Udell

arXiv:2309.02014·math.OC·March 15, 2024

TL;DR

This paper presents PROMISE, a suite of preconditioned stochastic optimization algorithms built on scalable curvature estimates. Run with default hyperparameters, these algorithms match or outperform tuned stochastic gradient methods on large-scale convex machine learning problems.

Contribution

The paper introduces the PROMISE algorithms (preconditioned versions of SVRG, SAGA, and Katyusha), each with theoretical guarantees and effective default hyperparameters, along with a new notion, quadratic regularity, used in the convergence analysis.

Findings

01. PROMISE algorithms outperform or match tuned optimizers on benchmark problems.

02. Theoretical analysis shows linear convergence under quadratic regularity.

03. Default hyperparameters are effective across diverse problems.

Abstract

This paper introduces PROMISE (Preconditioned Stochastic Optimization Methods by Incorporating Scalable Curvature Estimates), a suite of sketching-based preconditioned stochastic gradient algorithms for solving large-scale convex optimization problems arising in machine learning. PROMISE includes preconditioned versions of SVRG, SAGA, and Katyusha; each algorithm comes with a strong theoretical analysis and effective default hyperparameter values. In contrast, traditional stochastic gradient methods require careful hyperparameter tuning to succeed, and degrade in the presence of ill-conditioning, a ubiquitous phenomenon in machine learning. Empirically, we verify the superiority of the proposed algorithms by showing that, using default hyperparameter values, they outperform or match popular tuned stochastic gradient…
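To make the idea of a preconditioned variance-reduced method concrete, here is a minimal sketch of SVRG with a curvature-based preconditioner on a small, badly scaled ridge regression problem. It is an illustration under stated assumptions, not the paper's algorithm: the preconditioner is the diagonal of a subsampled Hessian, a deliberately simple stand-in for the sketching-based preconditioners the abstract refers to, and every name below (`make_problem`, `subsampled_diag_hessian`, `preconditioned_svrg`), the synthetic data, and all parameter values are hypothetical.

```python
import random


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def make_problem(n=200, scales=(1.0, 10.0, 100.0), seed=0):
    """Synthetic ridge regression data whose feature scales differ by 100x."""
    rng = random.Random(seed)
    w_true = [1.0 / s for s in scales]
    A = [[s * rng.gauss(0.0, 1.0) for s in scales] for _ in range(n)]
    b = [dot(a, w_true) + 0.01 * rng.gauss(0.0, 1.0) for a in A]
    return A, b


def loss(w, A, b, mu):
    """Average squared error plus ridge penalty."""
    data = sum((dot(a, w) - bi) ** 2 for a, bi in zip(A, b)) / (2 * len(A))
    return data + 0.5 * mu * dot(w, w)


def grad_i(w, a, bi, mu):
    """Gradient of one regularized component function."""
    r = dot(a, w) - bi
    return [r * aj + mu * wj for aj, wj in zip(a, w)]


def full_grad(w, A, b, mu):
    """Average of the component gradients over the whole dataset."""
    g = [0.0] * len(w)
    for a, bi in zip(A, b):
        g = [gj + gij / len(A) for gj, gij in zip(g, grad_i(w, a, bi, mu))]
    return g


def subsampled_diag_hessian(A, mu, batch, rng):
    """Diagonal of the ridge Hessian, estimated from `batch` random rows.

    Assumption: a diagonal estimate replaces the paper's sketching-based
    preconditioners purely to keep the example short.
    """
    idx = rng.sample(range(len(A)), batch)
    return [sum(A[i][j] ** 2 for i in idx) / batch + mu for j in range(len(A[0]))]


def preconditioned_svrg(A, b, mu, eta, epochs, precond=None, seed=1):
    """SVRG whose update direction is divided elementwise by `precond`.

    With precond=None the preconditioner is the identity (plain SVRG).
    """
    rng = random.Random(seed)
    n, d = len(A), len(A[0])
    p = precond if precond is not None else [1.0] * d
    w = [0.0] * d
    for _ in range(epochs):
        # Snapshot point and its full gradient, recomputed once per epoch.
        w_snap = list(w)
        g_snap = full_grad(w_snap, A, b, mu)
        for _ in range(n):
            i = rng.randrange(n)
            g_w = grad_i(w, A[i], b[i], mu)
            g_s = grad_i(w_snap, A[i], b[i], mu)
            # Variance-reduced gradient, scaled by the inverse preconditioner.
            w = [wj - eta * (x - y + z) / pj
                 for wj, x, y, z, pj in zip(w, g_w, g_s, g_snap, p)]
    return w
```

One way to try it: build the data with `A, b = make_problem()`, form `p = subsampled_diag_hessian(A, 1e-3, batch=50, rng=random.Random(2))`, and call `preconditioned_svrg(A, b, 1e-3, eta=0.01, epochs=30, precond=p)`. Passing `precond=None` runs plain SVRG, whose step size then has to be chosen small enough for the largest feature scale, which is the kind of tuning burden under ill-conditioning that the abstract describes.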

Code & Models

Repositories

udellgroup/promise (Official)

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Markov Chains and Monte Carlo Methods

Methods: SAGA · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Logistic Regression