Frank-Wolfe variants for minimization of a sum of functions
Suhail M Shah

TL;DR
This paper introduces new variants of the Frank-Wolfe algorithm for minimizing sums of functions, including stochastic, distributed, and incremental gradient-based methods, with proven convergence rates and applications to regression and classification.
Contribution
The paper presents novel Frank-Wolfe variants inspired by dual averaging, distributed schemes, and incremental gradients, with theoretical convergence guarantees.
Findings
Convergence rates are established for all of the proposed algorithms.
Performance is studied on least squares regression and multinomial classification.
Distributed and stochastic variants improve scalability.
Abstract
We propose several variants of the Frank-Wolfe algorithm to minimize a sum of functions. The main proposed algorithm is inspired by the dual averaging scheme of Nesterov, adapted for Frank-Wolfe in a stochastic setting. A distributed version of this scheme is also suggested. Additionally, we propose a Frank-Wolfe variant based on incremental gradient techniques. The convergence rates for all the proposed algorithms are established. The performance is studied on least squares regression and multinomial classification.
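For readers unfamiliar with the base method, the following is a minimal sketch of the classical (deterministic) Frank-Wolfe iteration applied to a least squares objective written as a sum of per-sample losses. It is not any of the paper's proposed variants. The l1-ball constraint, the step size 2/(k+2), and all function names (`lmo_l1_ball`, `frank_wolfe_least_squares`, `objective`) are assumptions chosen for illustration only.

```python
import numpy as np


def lmo_l1_ball(grad, radius):
    """Linear minimization oracle: argmin over ||s||_1 <= radius of <grad, s>.

    The minimizer is a signed vertex of the l1 ball along the coordinate
    with the largest absolute gradient entry.
    """
    i = int(np.argmax(np.abs(grad)))
    s = np.zeros_like(grad)
    s[i] = -radius * np.sign(grad[i])
    return s


def objective(A, b, x):
    """f(x) = (1/n) * sum_i 0.5 * (a_i^T x - b_i)^2."""
    return 0.5 * np.mean((A @ x - b) ** 2)


def frank_wolfe_least_squares(A, b, radius=1.0, num_iters=500):
    """Classical Frank-Wolfe on the l1 ball (illustrative assumption)."""
    n, d = A.shape
    x = np.zeros(d)  # feasible starting point
    for k in range(num_iters):
        grad = A.T @ (A @ x - b) / n      # full gradient of the sum
        s = lmo_l1_ball(grad, radius)     # projection-free direction
        gamma = 2.0 / (k + 2.0)           # standard open-loop step size
        x = (1.0 - gamma) * x + gamma * s  # convex combination stays feasible
    return x
```

Each iterate is a convex combination of points in the constraint set, so no projection step is needed. A stochastic variant would replace the full gradient with an estimate built from a subset of the samples.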
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Statistical Methods and Inference
