Self-Concordant Analysis of Frank-Wolfe Algorithms
Pavel Dvurechensky, Petr Ostroukhov, Kamil Safin, Shimrit Shtern, Mathias Staudigl

TL;DR
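This paper develops new theoretical guarantees for Frank-Wolfe algorithms applied to self-concordant functions: an adaptive step size with a global O(1/k) convergence rate, and a linearly convergent variant when a stronger local linear minimization oracle is available. The results cover losses with unbounded curvature, such as those arising in Poisson inverse problems and quantum state tomography.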
Contribution
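The paper analyzes Frank-Wolfe (FW) methods for self-concordant (SC) functions. It derives an adaptive step size from SC theory, proves a global O(1/k) rate for it, and constructs a new FW variant that converges linearly when a local linear minimization oracle is available.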
Findings
Global convergence rate of O(1/k) for FW on SC functions
A new FW variant with linear convergence under stronger oracle assumptions
SC theory supplies an adaptive step size for losses with unbounded curvature, a setting in which existing FW methods have no theoretical guarantees
Abstract
Projection-free optimization via different variants of the Frank-Wolfe (FW) method, a.k.a. the Conditional Gradient method, has become one of the cornerstones of optimization for machine learning, since in many cases the linear minimization oracle is much cheaper to implement than projections and some sparsity needs to be preserved. In a number of applications, e.g. Poisson inverse problems or quantum state tomography, the loss is given by a self-concordant (SC) function with unbounded curvature, which implies the absence of theoretical guarantees for existing FW methods. We use the theory of SC functions to provide a new adaptive step size for FW methods and prove a global convergence rate of O(1/k) after k iterations. If the problem admits a stronger local linear minimization oracle, we construct a novel FW method with a linear convergence rate for SC functions.
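To make the idea of an SC-based adaptive step size concrete, here is a minimal sketch, not the authors' reference implementation. It assumes a toy objective f(x) = -sum_i log(a_i^T x) over the probability simplex, which is standard self-concordant (M = 2) wherever A x > 0. The step size is obtained by minimizing the usual self-concordant upper bound along the FW direction, which may differ in detail from the rule stated in the paper. The names `objective`, `fw_adaptive`, `A`, and `x0` are hypothetical and chosen only for illustration.

```python
import numpy as np

def objective(A, x):
    # f(x) = -sum_i log(a_i^T x); requires A @ x > 0 elementwise.
    return -np.sum(np.log(A @ x))

def fw_adaptive(A, x0, iters=200, M=2.0, tol=1e-10):
    """Frank-Wolfe on the simplex with a step size derived from the
    self-concordant upper bound (an illustrative assumption)."""
    x = x0.astype(float).copy()
    values = [objective(A, x)]
    for _ in range(iters):
        Ax = A @ x
        grad = -A.T @ (1.0 / Ax)
        # Linear minimization oracle over the simplex: the vertex with
        # the smallest gradient coordinate.
        s = np.zeros_like(x)
        s[np.argmin(grad)] = 1.0
        d = s - x
        gap = -grad @ d                    # FW duality gap, >= 0
        if gap <= tol:
            break
        # Local norm sqrt(d^T H d) with H = A^T diag(1 / Ax^2) A.
        e = np.linalg.norm((A @ d) / Ax)
        delta = 0.5 * M * e
        # Minimizer of -t*gap + (4/M^2) * (-t*delta - log(1 - t*delta)).
        t = gap / (delta * (gap + 4.0 * delta / M**2))
        x = x + min(1.0, t) * d            # convex combination, stays feasible
        values.append(objective(A, x))
    return x, values

rng = np.random.default_rng(0)
A = rng.uniform(0.5, 1.5, size=(20, 5))    # positive entries keep A @ x > 0
x0 = np.full(5, 0.2)                       # barycenter of the simplex
x_fw, values = fw_adaptive(A, x0)
```

Because the step minimizes an upper bound on f along the FW direction, the objective values in `values` should be non-increasing, and no Lipschitz constant of the gradient is needed. Only the local norm of the direction and the duality gap enter the step size.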
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Advanced Optimization Algorithms Research
