TL;DR
This paper introduces an improved zeroth-order stochastic Frank-Wolfe algorithm with double variance reduction, significantly enhancing query efficiency for high-dimensional constrained finite-sum optimization in machine learning.
Contribution
It develops a novel double variance reduction framework that reduces gradient approximation and sampling variances, achieving state-of-the-art query complexities without explicit gradient computations.
Findings
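Achieves a state-of-the-art query complexity of O(d√n/ε) for convex objectives, where d is the dimensionality, n is the number of component functions in the finite sum, and ε is the target accuracy.
Achieves a state-of-the-art query complexity of O(d^{3/2}√n/ε^2) for non-convex objectives.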
Demonstrates superior empirical performance on machine learning tasks.
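To get a feel for how the two bounds above scale, the following sketch evaluates them with all constants and logarithmic factors dropped. The function names are illustrative and do not come from the paper, and the returned numbers are only meaningful as ratios between settings, not as actual query counts.

```python
import math


def convex_query_bound(d: int, n: int, eps: float) -> float:
    """d * sqrt(n) / eps, with constants omitted (illustrative only)."""
    return d * math.sqrt(n) / eps


def nonconvex_query_bound(d: int, n: int, eps: float) -> float:
    """d^(3/2) * sqrt(n) / eps^2, with constants omitted (illustrative only)."""
    return d ** 1.5 * math.sqrt(n) / eps ** 2


def scaling_ratio(bound, base, changed):
    """Ratio of a bound between two (d, n, eps) settings."""
    return bound(*changed) / bound(*base)
```

Under these expressions, quadrupling n doubles both bounds, while halving ε doubles the convex bound and quadruples the non-convex one. For example, `scaling_ratio(nonconvex_query_bound, (100, 10_000, 0.1), (100, 10_000, 0.05))` compares the two accuracy targets at fixed d and n.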
Abstract
We propose an enhanced zeroth-order stochastic Frank-Wolfe framework to address constrained finite-sum optimization problems, a structure prevalent in large-scale machine-learning applications. Our method introduces a novel double variance reduction framework that effectively reduces the gradient approximation variance induced by zeroth-order oracles and the stochastic sampling variance from finite-sum objectives. By leveraging this framework, our algorithm achieves significant improvements in query efficiency, making it particularly well-suited for high-dimensional optimization tasks. Specifically, for convex objectives, the algorithm achieves a query complexity of O(d\sqrt{n}/\epsilon) to find an \epsilon-suboptimal solution, where d is the dimensionality and n is the number of functions in the finite-sum objective. For non-convex objectives, it achieves a query complexity of…
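The two sources of variance named in the abstract can be illustrated with a generic sketch. This is not the paper's algorithm, whose double variance reduction scheme is not specified in this summary. It is a minimal stand-in that assumes coordinate-wise central finite differences as the zeroth-order oracle, an SVRG-style snapshot correction for the sampling variance, and an L1-ball constraint. All function names, the step-size rule, and the default parameters are assumptions chosen for illustration.

```python
import numpy as np


def zo_grad(f, x, mu=1e-5):
    """Coordinate-wise central finite-difference gradient estimate.

    Uses 2*d function queries and no explicit gradient.
    """
    g = np.zeros(x.size)
    for j in range(x.size):
        e = np.zeros(x.size)
        e[j] = mu
        g[j] = (f(x + e) - f(x - e)) / (2 * mu)
    return g


def lmo_l1(g, radius):
    """Linear minimization oracle: argmin of <g, s> over the L1 ball."""
    j = np.argmax(np.abs(g))
    s = np.zeros_like(g)
    s[j] = -radius * np.sign(g[j])
    return s


def zo_svrg_frank_wolfe(fs, x0, radius, epochs=20, inner=10, batch=5, seed=0):
    """Zeroth-order Frank-Wolfe with an SVRG-style gradient estimator (sketch)."""
    rng = np.random.default_rng(seed)
    n = len(fs)
    x = x0.copy()
    t = 0
    for _ in range(epochs):
        # Snapshot: full zeroth-order gradient estimate over all n components.
        snap = x.copy()
        full = sum(zo_grad(f, snap) for f in fs) / n
        for _ in range(inner):
            # Mini-batch correction relative to the snapshot.
            idx = rng.choice(n, size=batch, replace=False)
            corr = sum(zo_grad(fs[i], x) - zo_grad(fs[i], snap) for i in idx) / batch
            s = lmo_l1(full + corr, radius)
            gamma = 2.0 / (t + 2)  # classic Frank-Wolfe step size
            x = (1 - gamma) * x + gamma * s  # convex combination stays feasible
            t += 1
    return x
```

Because each update is a convex combination of points in the constraint set, the iterate stays feasible without any projection, which is the usual reason for choosing Frank-Wolfe on constrained problems. Note that this sketch spends 2d queries per component gradient estimate, so it should not be expected to match the query complexities stated in the abstract.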
