Beyond Short Steps in Frank-Wolfe Algorithms
David Martínez-Rubio, Sebastian Pokutta

TL;DR
This paper introduces advanced Frank-Wolfe algorithms that leverage function smoothness and optimistic strategies to improve convergence rates and practical stopping criteria, with broader applicability to gradient descent methods.
Contribution
It presents a novel Frank-Wolfe algorithm with an optimistic framework, a generalized short-step strategy, and refined primal-dual convergence analysis, extending beyond traditional methods.
Findings
Optimistic Frank-Wolfe algorithm empirically outperforms existing methods.
Generalized short-step strategy applicable to gradient descent.
Tighter primal-dual convergence rates achieved.
Abstract
We introduce novel techniques to enhance Frank-Wolfe algorithms by leveraging function smoothness beyond traditional short steps. Our study focuses on Frank-Wolfe algorithms with step sizes that incorporate primal-dual guarantees, offering practical stopping criteria. We present a new Frank-Wolfe algorithm utilizing an optimistic framework and provide a primal-dual convergence proof. Additionally, we propose a generalized short-step strategy aimed at optimizing a computable primal-dual gap. Interestingly, this new generalized short-step strategy is also applicable to gradient descent algorithms beyond Frank-Wolfe methods. As a byproduct, our work revisits and refines primal-dual techniques for analyzing Frank-Wolfe algorithms, achieving tighter primal-dual convergence rates. Empirical results demonstrate that our optimistic algorithm outperforms existing methods, highlighting its…
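For orientation, the classical short-step Frank-Wolfe baseline that the abstract takes as its starting point can be sketched as follows. This is not the paper's optimistic algorithm or its generalized short-step rule. It is the textbook method, in which the step size minimizes the smoothness upper bound along the Frank-Wolfe direction and the Frank-Wolfe gap serves as a computable stopping criterion. The test problem (projection onto the probability simplex, where $L = 1$), the function name `frank_wolfe_short_step`, and the tolerance are assumptions chosen only for illustration.

```python
def frank_wolfe_short_step(y, tol=1e-6, max_iter=10_000):
    """Classical short-step Frank-Wolfe on the probability simplex.

    Illustrative objective (an assumption, not from the paper):
        f(x) = 0.5 * ||x - y||^2, which is L-smooth with L = 1.
    Returns the final iterate, the last computed Frank-Wolfe gap,
    and the number of completed step updates.
    """
    n = len(y)
    L = 1.0
    x = [1.0 / n] * n          # start at the barycenter of the simplex
    gap = float("inf")
    iters = 0
    for _ in range(max_iter):
        grad = [xi - yi for xi, yi in zip(x, y)]

        # Linear minimization oracle on the simplex: the vertex e_i
        # with the smallest gradient coordinate.
        i = min(range(n), key=lambda j: grad[j])
        v = [0.0] * n
        v[i] = 1.0

        # Frank-Wolfe gap <grad, x - v>; by convexity it upper-bounds
        # the primal gap f(x) - f(x*), so it is a valid stopping test.
        d = [xi - vi for xi, vi in zip(x, v)]
        gap = sum(g * di for g, di in zip(grad, d))
        if gap <= tol:
            break

        # Short step: minimizer over [0, 1] of the smoothness bound
        #   f(x) - gamma * gap + (L / 2) * gamma^2 * ||x - v||^2.
        dist_sq = sum(di * di for di in d)
        gamma = min(1.0, gap / (L * dist_sq))
        x = [xi - gamma * di for xi, di in zip(x, d)]
        iters += 1
    return x, gap, iters
```

Calling `frank_wolfe_short_step([0.2, 0.5, 0.3])` returns an iterate on the simplex together with a gap value that certifies its suboptimality, which is the kind of computable guarantee the abstract refers to.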
Peer Reviews
Decision·ICLR 2026 Poster
Regardless of the weaknesses mentioned below, I believe this paper, by proposing a new optimistic variant of the Frank-Wolfe method with a convergence guarantee on the computable measure, has its own merit, and warrants further investigation in this direction.
- I agree that the primal-dual gap can serve as a practical stopping criterion, but I don't follow the authors' claim that this justifies the need for a method with a guaranteed convergence rate for that gap. First, the primal-dual gap is not a tight bound on the primal gap $f(x_t) - f(x_*)$, and it can be computed regardless of whether we have a theoretical guarantee on its decrease. Moreover, the paper does not show that existing methods do not efficiently decrease the primal-dual gap (in theo
- The writing is clear and easy to read overall. The ideas of each section are well presented through selected key inequalities from the proof, which makes for a much better reading experience.
- The paper includes numerical experiments that support the theoretical results.
See **Questions.**
This submission is conceptually innovative. The introduction of optimism into the FW setting is novel and well-motivated through online learning theory (OMD/FTRL frameworks). The primal-dual short-step idea elegantly unifies step-size selection with primal-dual analysis, offering tighter convergence and computable stopping criteria. The authors present a rigorous theoretical framework. The analysis is carefully constructed through primal-dual gap bounds, clearly improving on classical FW and
There are no major weaknesses in this submission. One minor weakness is that the experimental scope is limited. Only convex smooth problems are tested; no constrained stochastic or non-convex scenarios. The proposed primal-dual short steps did not outperform standard heuristics in practice, which is acknowledged but not deeply analyzed. It is suggested to add numerical experiments with a broader scope, such as non-smooth and stochastic benchmarks, to demonstrate generalit
Taxonomy
Topics: Quantum Computing Algorithms and Architecture · Stochastic Gradient Optimization Techniques · Metaheuristic Optimization Algorithms Research
