TL;DR
This paper establishes the optimality of the $O(1/k)$ convergence rate for Bregman Gradient methods under relative smoothness, and develops a framework to construct worst-case functions for these methods.
Contribution
It proves the $O(1/k)$ rate is optimal for Bregman first-order methods under relative smoothness and extends performance estimation to construct worst-case functions.
Findings
The $O(1/k)$ convergence rate of the NoLips (Bregman Gradient) method is proven optimal for the class of relatively smooth functions.
A constructive method for worst-case functions is developed.
The performance estimation framework is extended to Bregman first-order methods and to the classes of differentiable and strictly convex functions.
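For orientation, the objects named in these findings can be written out as follows. This is a sketch in the standard notation of the relative-smoothness literature, not a quotation from the paper: the kernel $h$, the constant $L$, the step size $\lambda$, and the minimizer $x^\star$ are symbols assumed here for illustration.

$$
\begin{aligned}
D_h(x, y) &= h(x) - h(y) - \langle \nabla h(y),\, x - y \rangle
  && \text{(Bregman distance of the kernel } h\text{)} \\
L h - f &\ \text{is convex}
  && \text{(} f \text{ is } L\text{-smooth relative to } h\text{)} \\
x_{k+1} &= \arg\min_{u} \Big\{ \langle \nabla f(x_k),\, u - x_k \rangle + \tfrac{1}{\lambda} D_h(u, x_k) \Big\}
  && \text{(Bregman Gradient step, } 0 < \lambda \le 1/L\text{)} \\
f(x_k) - f(x^\star) &\le \frac{D_h(x^\star, x_0)}{\lambda k}
  && \text{(the usual } O(1/k) \text{ upper bound)}
\end{aligned}
$$

With $h(x) = \tfrac{1}{2}\|x\|^2$ the step reduces to ordinary gradient descent and relative smoothness reduces to a Lipschitz-continuous gradient. The optimality result concerns the general case, where $h$ is an arbitrary admissible kernel.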
Abstract
We provide a lower bound showing that the $O(1/k)$ convergence rate of the NoLips method (a.k.a. Bregman Gradient) is optimal for the class of functions satisfying the $h$-smoothness assumption. This assumption, also known as relative smoothness, appeared in the recent developments around the Bregman Gradient method, where acceleration remained an open issue. On the way, we show how to constructively obtain the corresponding worst-case functions by extending the computer-assisted performance estimation framework of Drori and Teboulle (Mathematical Programming, 2014) to Bregman first-order methods, and to handle the classes of differentiable and strictly convex functions.
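To make the method in the abstract concrete, here is a minimal sketch of Bregman Gradient iterations in Python. It is an illustration under stated assumptions, not the authors' code: the kernel is taken to be the entropy $h(x) = \sum_i x_i \log x_i$ on the probability simplex, the objective is a small synthetic least-squares problem, and the names `bregman_gradient_entropy`, `A`, `b`, and `x_true` are hypothetical. With this kernel the Bregman step has a closed form (a multiplicative update followed by renormalization), and a least-squares objective is $L$-smooth relative to $h$ on the simplex with $L = \max_{i,j} |(A^\top A)_{ij}|$.

```python
import numpy as np


def bregman_gradient_entropy(grad_f, x0, step, n_iters):
    """Bregman Gradient iterations with the entropy kernel on the simplex.

    Assumes x0 has strictly positive entries summing to one.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iters):
        g = grad_f(x)
        # Closed-form Bregman step for the entropy kernel: multiplicative
        # update, then renormalize. Subtracting g.min() only rescales w by a
        # constant, which cancels in the normalization; it avoids overflow.
        w = x * np.exp(-step * (g - g.min()))
        x = w / w.sum()
    return x


# Synthetic test problem (illustrative data, not taken from the paper).
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
x_true = np.array([0.10, 0.20, 0.30, 0.25, 0.15])  # lies on the simplex
b = A @ x_true                                      # so the optimal value is 0


def f(x):
    return 0.5 * np.sum((A @ x - b) ** 2)


def grad_f(x):
    return A.T @ (A @ x - b)


# Relative smoothness constant of f with respect to the entropy on the simplex.
L = np.abs(A.T @ A).max()

n_iters = 200
x0 = np.full(5, 0.2)
x_k = bregman_gradient_entropy(grad_f, x0, step=1.0 / L, n_iters=n_iters)

# Bregman distance from the solution to the starting point: for the entropy
# kernel on the simplex this is the Kullback-Leibler divergence.
kl = np.sum(x_true * np.log(x_true / x0))
upper_bound = L * kl / n_iters

print(f"f(x_k) = {f(x_k):.3e}, O(1/k) bound = {upper_bound:.3e}")
```

The printed bound is the standard $L\,D_h(x^\star, x_0)/k$ guarantee for step size $1/L$. The lower bound in the paper says that no Bregman first-order method can improve on this rate over the whole class of relatively smooth functions, even though a particular instance such as this one may converge faster.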
