Stochastic Frank-Wolfe Methods for Nonconvex Optimization

Sashank J. Reddi; Suvrit Sra; Barnabas Poczos; Alex Smola

arXiv:1607.08254 · math.OC · August 1, 2016

TL;DR

This paper proposes stochastic Frank-Wolfe methods for nonconvex optimization and analyzes their convergence. For finite-sum objectives, it adds variance reduced variants that provably converge faster than the classical Frank-Wolfe method.

Contribution

It develops the first variance reduced Frank-Wolfe methods for nonconvex finite-sum problems, with provably faster convergence than the classical Frank-Wolfe method, extending Frank-Wolfe algorithms to nonconvex settings.
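As a rough illustration of what "variance reduced" can mean here, the sketch below builds an SVRG-style gradient estimator, one common variance reduction technique for finite-sum objectives. Whether it matches the paper's exact construction is an assumption, not something this summary states. The names `make_sine_grad`, `full_grad`, and `svrg_estimate` are hypothetical, and the components f_i(x) = sin(a_i . x) are chosen only because they are nonconvex and easy to differentiate.

```python
import math


def make_sine_grad(a):
    """Return the gradient of the nonconvex component f(x) = sin(a . x)."""
    def grad(x):
        c = math.cos(sum(aj * xj for aj, xj in zip(a, x)))
        return [c * aj for aj in a]
    return grad


def full_grad(grads, x):
    """Exact gradient of the finite-sum average (1/n) * sum_i f_i at x."""
    n = len(grads)
    total = [0.0] * len(x)
    for g in grads:
        for j, gj in enumerate(g(x)):
            total[j] += gj / n
    return total


def svrg_estimate(grads, x, snapshot, snapshot_grad, batch):
    """SVRG-style estimate of the gradient at x (illustrative assumption).

    Starts from the full gradient stored at the snapshot point and corrects
    it with minibatch differences grad_i(x) - grad_i(snapshot).
    """
    est = list(snapshot_grad)
    for i in batch:
        gx = grads[i](x)
        gs = grads[i](snapshot)
        for j in range(len(x)):
            est[j] += (gx[j] - gs[j]) / len(batch)
    return est
```

The estimate is exact when `x` equals the snapshot, and its average over uniformly sampled indices equals the full gradient at `x`. A Frank-Wolfe step would pass this estimate, instead of a plain minibatch gradient, to its linear minimization oracle.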

Findings

01. Variance reduced methods outperform classical Frank-Wolfe in convergence speed.

02. Proposed algorithms achieve faster convergence in stochastic and finite-sum nonconvex optimization.

03. Theoretical analysis confirms improved convergence rates for the new methods.

Abstract

We study Frank-Wolfe methods for nonconvex stochastic and finite-sum optimization problems. Frank-Wolfe methods (in the convex case) have gained tremendous recent interest in machine learning and optimization communities due to their projection-free property and their ability to exploit structured constraints. However, our understanding of these algorithms in the nonconvex setting is fairly limited. In this paper, we propose nonconvex stochastic Frank-Wolfe methods and analyze their convergence properties. For objective functions that decompose into a finite-sum, we leverage ideas from variance reduction techniques for convex optimization to obtain new variance reduced nonconvex Frank-Wolfe methods that have provably faster convergence than the classical Frank-Wolfe method. Finally, we show that the faster convergence rates of our variance reduced methods also translate into improved…
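To make the projection-free property concrete, here is a minimal sketch of a stochastic Frank-Wolfe loop over the probability simplex. It is not the paper's algorithm: the robust loss r^2 / (1 + r^2), the constant step size 1 / sqrt(steps), the minibatch sampling, and all function names are assumptions made for illustration. Each iteration calls a linear minimization oracle and takes a convex combination, so no projection is needed. The Frank-Wolfe gap computed by `fw_gap` is a standard stationarity measure for constrained nonconvex problems.

```python
import random


def grad_i(x, a, b):
    """Gradient of the nonconvex loss r^2 / (1 + r^2), with r = a . x - b."""
    r = sum(aj * xj for aj, xj in zip(a, x)) - b
    scale = 2.0 * r / (1.0 + r * r) ** 2
    return [scale * aj for aj in a]


def minibatch_grad(x, data, batch):
    """Average of the component gradients over the indices in `batch`."""
    g = [0.0] * len(x)
    for i in batch:
        a, b = data[i]
        gi = grad_i(x, a, b)
        for j in range(len(x)):
            g[j] += gi[j] / len(batch)
    return g


def lmo_simplex(g):
    """Linear minimization oracle: argmin of <g, v> over the simplex is a vertex."""
    k = min(range(len(g)), key=lambda j: g[j])
    v = [0.0] * len(g)
    v[k] = 1.0
    return v


def fw_gap(x, g):
    """Frank-Wolfe gap max_v <g, x - v>; non-negative for feasible x."""
    v = lmo_simplex(g)
    return sum(gj * (xj - vj) for gj, xj, vj in zip(g, x, v))


def stochastic_frank_wolfe(data, dim, steps=200, batch_size=8, seed=0):
    rng = random.Random(seed)
    x = [1.0 / dim] * dim              # start at the center of the simplex
    gamma = 1.0 / steps ** 0.5         # constant step size (an assumption)
    for _ in range(steps):
        batch = [rng.randrange(len(data)) for _ in range(batch_size)]
        g = minibatch_grad(x, data, batch)
        v = lmo_simplex(g)
        # Convex combination keeps x feasible without any projection.
        x = [(1 - gamma) * xj + gamma * vj for xj, vj in zip(x, v)]
    return x
```

Here `data` is a list of `(a, b)` pairs, with `a` a list of length `dim`. Evaluating `fw_gap` with the full gradient, for example `minibatch_grad(x, data, range(len(data)))`, gives a simple check of how close the returned point is to stationarity.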
