Polytope conditioning and linear convergence of the Frank-Wolfe algorithm

Javier Pena; Daniel Rodriguez

arXiv:1512.06142·math.OC·December 28, 2016·Math. Oper. Res.

TL;DR

This paper studies the linear convergence of the Frank-Wolfe algorithm with away steps over polytopes. It shows that several seemingly different polytope condition measures are essentially equivalent, and it introduces a single polytope parameter that unifies them and governs the convergence rate.

Contribution

It unifies different polytope condition measures and introduces a new parameter that explains the linear convergence of Frank-Wolfe over polytopes.

Findings

01. Previously proposed polytope condition measures are essentially equivalent.

02. A new polytope parameter formalizes the premise behind the algorithm's linear convergence.

03. For quadratic objectives, the convergence rate depends on the condition number of a suitably scaled polytope.

Abstract

It is known that the gradient descent algorithm converges linearly when applied to a strongly convex function with Lipschitz gradient. In this case the algorithm's rate of convergence is determined by the condition number of the function. In a similar vein, it has been shown that a variant of the Frank-Wolfe algorithm with away steps converges linearly when applied to a strongly convex function with Lipschitz gradient over a polytope. In a nice extension of the unconstrained case, the algorithm's rate of convergence is determined by the product of the condition number of the function and a certain condition number of the polytope. We shed new light on the latter type of polytope conditioning. In particular, we show that previous and seemingly different approaches to define a suitable condition measure for the polytope are essentially equivalent to each other. Perhaps more…
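
To make the away-step variant mentioned in the abstract concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the authors' implementation: the polytope is taken to be the probability simplex (so every vertex is a coordinate vector), the objective is a strongly convex quadratic 0.5 x'Qx + c'x, and the step size comes from exact line search. The function name `away_step_frank_wolfe` and its parameters are hypothetical.

```python
import numpy as np


def away_step_frank_wolfe(Q, c, x0, tol=1e-8, max_iter=10_000):
    """Sketch: minimize 0.5*x@Q@x + c@x over the probability simplex.

    Assumes Q is symmetric positive definite and x0 lies in the simplex.
    Returns the final iterate and the list of Frank-Wolfe gaps.
    """
    x = np.asarray(x0, dtype=float).copy()
    gaps = []
    for _ in range(max_iter):
        g = Q @ x + c                      # gradient at the current iterate
        s = int(np.argmin(g))              # Frank-Wolfe vertex e_s
        fw_gap = float(g @ x - g[s])       # <-g, e_s - x>, an optimality certificate
        gaps.append(fw_gap)
        if fw_gap <= tol:
            break

        # Away vertex: the worst vertex among those currently carrying weight.
        support = np.flatnonzero(x > 0)
        v = int(support[np.argmax(g[support])])
        away_gap = float(g[v] - g @ x)     # <-g, x - e_v>

        if fw_gap >= away_gap:
            # Regular step: move toward e_s.
            is_away = False
            d = -x.copy()
            d[s] += 1.0
            gamma_max = 1.0
        else:
            # Away step: move away from e_v without leaving the simplex.
            is_away = True
            d = x.copy()
            d[v] -= 1.0
            gamma_max = x[v] / (1.0 - x[v])

        # Exact line search for a quadratic, clipped to the feasible range.
        curvature = float(d @ Q @ d)
        gamma = min(gamma_max, -float(g @ d) / curvature)
        x = x + gamma * d
        if is_away and gamma == gamma_max:
            x[v] = 0.0                     # drop step: remove e_v from the support
    return x, gaps
```

Calling it with `Q = np.eye(n)` and `c = -b` for a point `b` in the simplex projects `b` onto the simplex, which gives a simple sanity check. The returned `gaps` list can be plotted on a log scale to look for the linear decay the paper analyzes.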
