PROMISE: Preconditioned Stochastic Optimization Methods by Incorporating Scalable Curvature Estimates
Zachary Frangella, Pratik Rathore, Shipu Zhao, and Madeleine Udell

TL;DR
This paper presents PROMISE, a suite of preconditioned stochastic optimization algorithms built on scalable curvature estimates. Using default hyperparameters, the algorithms outperform or match tuned traditional methods on large-scale convex machine learning tasks.
Contribution
The paper introduces the PROMISE algorithms, which come with theoretical guarantees and effective default hyperparameters, together with a new notion of quadratic regularity that supports the convergence analysis.
Findings
PROMISE algorithms outperform or match tuned optimizers on benchmark problems.
Theoretical analysis shows linear convergence under quadratic regularity.
Default hyperparameters are effective across diverse problems.
Abstract
This paper introduces PROMISE (Preconditioned Stochastic Optimization Methods by Incorporating Scalable Curvature Estimates), a suite of sketching-based preconditioned stochastic gradient algorithms for solving large-scale convex optimization problems arising in machine learning. PROMISE includes preconditioned versions of SVRG, SAGA, and Katyusha; each algorithm comes with a strong theoretical analysis and effective default hyperparameter values. In contrast, traditional stochastic gradient methods require careful hyperparameter tuning to succeed, and degrade in the presence of ill-conditioning, a ubiquitous phenomenon in machine learning. Empirically, we verify the superiority of the proposed algorithms by showing that, using default hyperparameter values, they outperform or match popular tuned stochastic gradient…
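To make the idea of a preconditioned variance-reduced method concrete, here is a minimal sketch of SVRG with a subsampled-Hessian preconditioner on ridge-regularized least squares. It is an illustration under stated assumptions, not the paper's implementation: the function name `preconditioned_svrg`, the damping parameter `rho`, and every default value are hypothetical choices made for this example, and the preconditioner is built once rather than refreshed during the run.

```python
import numpy as np


def preconditioned_svrg(A, b, mu=1e-4, rho=1e-5, sample_size=400,
                        batch_size=64, step=0.5, epochs=30, seed=0):
    """Sketch of preconditioned SVRG for
    f(w) = 1/(2n) * ||A w - b||^2 + (mu/2) * ||w||^2.

    All names and defaults are illustrative assumptions, not the
    defaults recommended in the paper.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    w = np.zeros(d)

    # Curvature estimate: Hessian of the loss on a random subsample of rows,
    # plus the ridge term and a small damping rho (assumed for stability).
    rows = rng.choice(n, size=min(sample_size, n), replace=False)
    P = A[rows].T @ A[rows] / len(rows) + (mu + rho) * np.eye(d)

    def grad(w_eval, idx):
        # Average gradient of the regularized loss over the rows in idx.
        A_idx = A[idx]
        return A_idx.T @ (A_idx @ w_eval - b[idx]) / len(idx) + mu * w_eval

    all_rows = np.arange(n)
    steps_per_epoch = max(1, n // batch_size)
    for _ in range(epochs):
        # Snapshot point and its full gradient (the SVRG anchor).
        w_snap = w.copy()
        full_grad = grad(w_snap, all_rows)
        for _ in range(steps_per_epoch):
            batch = rng.choice(n, size=batch_size, replace=False)
            # Variance-reduced gradient estimate.
            v = grad(w, batch) - grad(w_snap, batch) + full_grad
            # Preconditioned step: solve P x = v instead of stepping along v.
            w = w - step * np.linalg.solve(P, v)
    return w
```

Because the step is taken along `P^{-1} v` rather than `v`, a single step size can work even when the columns of `A` have very different scales, which is the ill-conditioned regime the abstract refers to. On a non-quadratic loss the Hessian changes with `w`, so a fixed preconditioner like this one would need to be recomputed periodically.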
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Markov Chains and Monte Carlo Methods
Methods: SAGA · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Logistic Regression
