Frank Wolfe Meets Metric Entropy

Suhas Vijaykumar

arXiv:2205.08634 · stat.ML · May 19, 2022

TL;DR

This paper investigates the limitations of the Frank-Wolfe algorithm in high-dimensional settings. It shows that on certain domains, including random polytopes, the number of iterations required grows with the dimension, and it ties these lower bounds to the metric entropy of the domain, with implications for statistical algorithms.

Contribution

It introduces a general technique for establishing domain-specific lower bounds for Frank-Wolfe and its variants from the metric entropy of the domain, and demonstrates that dimension-free linear convergence fails even in the average case in high dimensions.
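For reference, metric entropy is usually defined through covering numbers. The notation below ($K$, $N(K, \epsilon)$, $H(K, \epsilon)$, $v_i$, $\lambda_i$) is an assumption made for illustration and is not taken from the paper. The sparsity remark is only the standard intuition for why such a quantity can constrain Frank-Wolfe; the paper's actual argument may differ in its details.

```latex
% Covering number and metric entropy of a set K (Euclidean balls assumed here)
N(K, \epsilon) = \min\Bigl\{ m : \exists\, x_1, \dots, x_m \ \text{with}\ K \subseteq \bigcup_{j=1}^{m} B(x_j, \epsilon) \Bigr\},
\qquad
H(K, \epsilon) = \log N(K, \epsilon).

% Vanilla Frank-Wolfe started at a vertex: after t iterations,
x_t = \sum_{i=1}^{t+1} \lambda_i v_i, \qquad \lambda_i \ge 0, \quad \sum_{i=1}^{t+1} \lambda_i = 1,
% so x_t is a convex combination of at most t+1 vertices v_i of the domain.
```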

Findings

01

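For a Gaussian or spherical random polytope in $\mathbb{R}^{d}$ with $\mathrm{poly}(d)$ vertices, Frank-Wolfe requires up to $\tilde{\Omega}(d)$ iterations to reach an $O(1/d)$ error bound.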

02

A dimension-free linear upper bound on the iteration count fails not only in the worst case but also in the average case.

03

The results have positive implications for statistical algorithms like gradient boosting.

Abstract

The Frank-Wolfe algorithm has seen a resurgence in popularity due to its ability to efficiently solve constrained optimization problems in machine learning and high-dimensional statistics. As such, there is much interest in establishing when the algorithm may possess a "linear" $O(\log(1/\epsilon))$ dimension-free iteration complexity comparable to projected gradient descent. In this paper, we provide a general technique for establishing domain-specific and easy-to-estimate lower bounds for Frank-Wolfe and its variants using the metric entropy of the domain. Most notably, we show that a dimension-free linear upper bound must fail not only in the worst case, but in the average case: for a Gaussian or spherical random polytope in $\mathbb{R}^{d}$ with $\mathrm{poly}(d)$ vertices, Frank-Wolfe requires up to $\tilde{\Omega}(d)$ iterations to achieve a $O(1/d)$ error bound, with high…
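To make the setting concrete, here is a minimal sketch of vanilla Frank-Wolfe on a Gaussian random polytope. It is not the paper's code or experiment. The quadratic objective, the target at the origin, the $1/\sqrt{d}$ vertex scaling, the step size $2/(t+2)$, and the name `frank_wolfe_polytope` are all assumptions chosen for illustration.

```python
import numpy as np


def frank_wolfe_polytope(vertices, target, num_iters):
    """Vanilla Frank-Wolfe for min 0.5 * ||x - target||^2 over conv(vertices).

    Illustrative sketch only: the objective and step size are assumptions,
    not taken from the paper.
    """
    n, _ = vertices.shape
    weights = np.zeros(n)
    weights[0] = 1.0                 # start at the first vertex
    x = vertices[0].copy()
    gaps = []
    for t in range(num_iters):
        grad = x - target            # gradient of the quadratic objective
        scores = vertices @ grad
        i = int(np.argmin(scores))   # linear minimization oracle over the vertices
        gaps.append(float(grad @ x - scores[i]))  # Frank-Wolfe duality gap
        gamma = 2.0 / (t + 2.0)      # standard open-loop step size
        weights *= 1.0 - gamma
        weights[i] += gamma
        x = (1.0 - gamma) * x + gamma * vertices[i]
    return x, weights, gaps


# Hypothetical small instance: n = d^2 Gaussian vertices in dimension d.
rng = np.random.default_rng(0)
d = 20
V = rng.standard_normal((d * d, d)) / np.sqrt(d)
x, w, gaps = frank_wolfe_polytope(V, np.zeros(d), num_iters=30)
print("active vertices:", np.count_nonzero(w), "final gap:", gaps[-1])
```

Each iteration calls the linear minimization oracle once and adds at most one vertex to the iterate's support, so the returned `weights` has at most `num_iters` nonzero entries. Rerunning with larger `d` and plotting `gaps` against the iteration count is one way to explore the dimension dependence the abstract describes, though a small experiment like this does not verify the stated bounds.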

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Domain Adaptation and Few-Shot Learning