ProxSARAH: An Efficient Algorithmic Framework for Stochastic Composite Nonconvex Optimization
Nhan H. Pham, Lam M. Nguyen, Dzung T. Phan, and Quoc Tran-Dinh

TL;DR
ProxSARAH is an efficient stochastic algorithmic framework for solving nonconvex composite optimization problems. It achieves the best-known complexity bounds and improves practical performance through new constant and adaptive step-size strategies.
Contribution
It presents a new stochastic framework using SARAH estimators with adaptive step-sizes, covering both finite-sum and expectation problems, and improves upon existing methods in complexity and practical efficiency.
Findings
Achieves best-known complexity bounds for nonconvex composite problems.
Demonstrates improved practical performance with larger constant step-sizes.
Validates effectiveness on neural network training and other composite nonconvex problems.
Abstract
We propose a new stochastic first-order algorithmic framework to solve stochastic composite nonconvex optimization problems that covers both finite-sum and expectation settings. Our algorithms rely on the SARAH estimator introduced in (Nguyen et al., 2017) and consist of two steps: a proximal gradient step and an averaging step, making them different from existing nonconvex proximal-type algorithms. The algorithms only require an average smoothness assumption on the nonconvex objective term and an additional bounded variance assumption if applied to expectation problems. They work with both constant and adaptive step-sizes, while allowing single samples and mini-batches. In all these cases, we prove that our algorithms can achieve the best-known complexity bounds. One key step of our methods is new constant and adaptive step-sizes that help to achieve desired complexity bounds while improving…
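The two-step structure described in the abstract (a proximal gradient step followed by an averaging step, both driven by a SARAH gradient estimator) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the least-squares loss, the L1 regularizer, the function names, and the constants `eta` and `gamma`, which are hypothetical placeholders rather than the step-sizes derived in the paper.

```python
import numpy as np


def soft_threshold(z, tau):
    """Proximal operator of tau * ||x||_1 (illustrative choice of regularizer)."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)


def batch_grad(A, y, x, idx):
    """Mini-batch gradient of 0.5 * (a_i @ x - y_i)**2 averaged over rows idx."""
    Ab, yb = A[idx], y[idx]
    return Ab.T @ (Ab @ x - yb) / len(idx)


def prox_sarah_sketch(A, y, lam=0.01, eta=0.5, gamma=0.5,
                      epochs=5, inner_steps=50, batch=8, seed=0):
    """Sketch of a SARAH-based proximal method with an averaging step.

    Minimizes (1/n) * sum_i 0.5 * (a_i @ x - y_i)**2 + lam * ||x||_1.
    eta and gamma are hypothetical constants, not the paper's step-sizes.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    for _ in range(epochs):
        # Each epoch starts from a full (snapshot) gradient.
        v = batch_grad(A, y, x, np.arange(n))
        for _ in range(inner_steps + 1):
            # Step 1: proximal gradient step using the current estimator v.
            x_hat = soft_threshold(x - eta * v, eta * lam)
            # Step 2: averaging step between the old iterate and x_hat.
            x_new = (1.0 - gamma) * x + gamma * x_hat
            # SARAH recursion: correct v with a gradient difference
            # evaluated on the same mini-batch at both iterates.
            idx = rng.integers(0, n, size=batch)
            v = v + batch_grad(A, y, x_new, idx) - batch_grad(A, y, x, idx)
            x = x_new
    return x
```

Calling `prox_sarah_sketch(A, y)` with an `(n, d)` data matrix `A` and a length-`n` target vector `y` returns the final iterate. Setting `gamma=1.0` removes the averaging step and reduces the update to a plain proximal step with a SARAH estimator, which makes the role of the averaging easy to compare.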
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Advanced Neural Network Applications
