Projection-Free Adaptive Gradients for Large-Scale Optimization
Cyrille W. Combettes, Christoph Spiegel, Sebastian Pokutta

TL;DR
This paper introduces a projection-free adaptive gradient method for large-scale constrained optimization. The authors derive convergence rates and demonstrate a computational advantage over state-of-the-art stochastic Frank-Wolfe algorithms on both convex and nonconvex problems.
Contribution
The paper develops a stochastic Frank-Wolfe algorithm that blends in adaptive gradients to improve the quality of its first-order information, and supports it with both convergence rates and experiments.
Findings
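Convergence rates are derived for the method on both convex and nonconvex objectives.
Experiments show a computational advantage over state-of-the-art stochastic Frank-Wolfe algorithms.
The method can also improve the performance of adaptive gradient algorithms for constrained optimization.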
Abstract
The complexity in large-scale optimization can lie in both handling the objective function and handling the constraint set. In this respect, stochastic Frank-Wolfe algorithms occupy a unique position as they alleviate both computational burdens, by querying only approximate first-order information from the objective and by maintaining feasibility of the iterates without using projections. In this paper, we improve the quality of their first-order information by blending in adaptive gradients. We derive convergence rates and demonstrate the computational advantage of our method over the state-of-the-art stochastic Frank-Wolfe algorithms on both convex and nonconvex objectives. The experiments further show that our method can improve the performance of adaptive gradient algorithms for constrained optimization.
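To make the two ideas in the abstract concrete, the following is a minimal illustrative sketch, not the authors' algorithm or code. It assumes a least-squares objective, an l1-ball constraint, and an AdaGrad-style diagonal scaling; the function names (lmo_l1, adaptive_sfw_sketch) and all parameter values are hypothetical. Each outer step draws a minibatch gradient, updates the diagonal scaling, and then takes a few Frank-Wolfe steps on the resulting quadratic subproblem, so the iterate stays feasible without any projection.

```python
import numpy as np

def lmo_l1(grad, radius):
    """Linear minimization oracle: argmin of <grad, v> over the l1 ball."""
    i = int(np.argmax(np.abs(grad)))
    v = np.zeros_like(grad)
    v[i] = -radius * np.sign(grad[i])  # a signed vertex of the ball
    return v

def adaptive_sfw_sketch(A, b, radius=1.0, steps=200, batch=32,
                        inner=5, eta=0.1, delta=1e-8, seed=0):
    """Illustrative only: minimize (1/2n)||Ax - b||^2 over ||x||_1 <= radius."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)          # feasible starting point
    sq_sum = np.zeros(d)     # running sum of squared stochastic gradients
    for _ in range(steps):
        idx = rng.choice(n, size=min(batch, n), replace=False)
        g = A[idx].T @ (A[idx] @ x - b[idx]) / len(idx)  # minibatch gradient
        sq_sum += g * g
        h = delta + np.sqrt(sq_sum)  # assumed AdaGrad-style diagonal scaling
        # Approximately minimize, over the l1 ball,
        #   Q(y) = <g, y> + (1 / (2 * eta)) * (y - x)^T diag(h) (y - x)
        # with a few Frank-Wolfe steps instead of a projection.
        y = x.copy()
        for _ in range(inner):
            q_grad = g + h * (y - x) / eta
            direction = lmo_l1(q_grad, radius) - y
            curvature = np.dot(h * direction, direction) / eta
            if curvature <= 0.0:
                break  # y already equals the returned vertex
            # Exact line search for the quadratic Q, clipped to [0, 1].
            gamma = min(1.0, max(0.0, -np.dot(q_grad, direction) / curvature))
            y = y + gamma * direction  # convex combination, so y stays feasible
        x = y
    return x
```

Because every update is a convex combination of feasible points, the l1 norm of the returned vector never exceeds radius. The line search, the number of inner steps, and the scaling rule are simple choices made for this sketch; the paper should be consulted for the actual method and its step-size rules.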
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Advanced Bandit Algorithms Research
