Choose your path wisely: gradient descent in a Bregman distance framework
Martin Benning, Marta M. Betcke, Matthias J. Ehrhardt, Carola-Bibiane Schönlieb

TL;DR
This paper extends gradient descent methods using Bregman distances to non-convex functions, demonstrating improved solutions through a coarse-to-fine approach and applications in imaging and neural networks.
Contribution
It introduces a generalized Bregman distance-based gradient descent framework with proven convergence for non-convex functions satisfying the Kurdyka-Łojasiewicz property, and shows its practical advantages.
Findings
Global convergence proven for certain non-convex functions.
Coarse-to-fine feature transition improves solution quality.
Effective in MRI, deconvolution, and neural network classification.
Abstract
We propose an extension of a special form of gradient descent -- in the literature known as linearised Bregman iteration -- to a larger class of non-convex functions. We replace the classical (squared) two norm metric in the gradient descent setting with a generalised Bregman distance, based on a proper, convex and lower semi-continuous function. The algorithm's global convergence is proven for functions that satisfy the Kurdyka-Łojasiewicz property. Examples illustrate that features of different scale are being introduced throughout the iteration, transitioning from coarse to fine. This coarse-to-fine approach with respect to scale allows to recover solutions of non-convex optimisation problems that are superior to those obtained with conventional gradient descent, or even projected and proximal gradient descent. The effectiveness of the linearised Bregman iteration in combination…
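To make the update described in the abstract concrete, here is a minimal sketch of a linearised Bregman iteration in plain Python. It rests on assumptions not taken from the paper: the Bregman function is chosen as J(x) = 0.5*||x||^2 + alpha*||x||_1, for which the update reduces to a gradient step on a subgradient variable p followed by soft-thresholding, and the objective is a convex toy least-squares energy E(x) = 0.5*||Ax - b||^2 rather than one of the paper's non-convex problems. All function and parameter names (`soft_threshold`, `grad_least_squares`, `linearised_bregman`, `alpha`, `tau`, `iters`) are hypothetical.

```python
def soft_threshold(v, alpha):
    """Component-wise shrinkage; maps a subgradient p of J back to x
    for the assumed J(x) = 0.5*||x||^2 + alpha*||x||_1."""
    return [max(abs(vi) - alpha, 0.0) * (1.0 if vi > 0 else -1.0) for vi in v]


def grad_least_squares(A, x, b):
    """Gradient of the toy energy E(x) = 0.5*||A x - b||^2, i.e. A^T (A x - b)."""
    residual = [sum(aij * xj for aij, xj in zip(row, x)) - bi
                for row, bi in zip(A, b)]
    return [sum(A[i][j] * residual[i] for i in range(len(A)))
            for j in range(len(x))]


def linearised_bregman(A, b, alpha=1.0, tau=0.1, iters=500):
    """Sketch of the iteration
         p_{k+1} = p_k - tau * grad E(x_k),   x_{k+1} = soft_threshold(p_{k+1}, alpha),
    which keeps p_{k+1} in the subdifferential of the assumed J at x_{k+1}."""
    n = len(A[0])
    x = [0.0] * n
    p = [0.0] * n  # 0 is a valid subgradient of J at x = 0
    history = []
    for _ in range(iters):
        g = grad_least_squares(A, x, b)
        p = [pi - tau * gi for pi, gi in zip(p, g)]
        x = soft_threshold(p, alpha)
        history.append(list(x))
    return x, history


if __name__ == "__main__":
    A = [[1.0, 0.0], [0.0, 1.0]]
    b = [3.0, 0.5]
    x, history = linearised_bregman(A, b)
    print("after 6 iterations:", history[5])
    print("final iterate:", x)
```

On this toy problem the larger entry of `b` enters the iterate several steps before the smaller one, which is a small-scale analogue of the coarse-to-fine behaviour the abstract describes. Setting `alpha=0` makes the shrinkage the identity, so the loop reduces to ordinary gradient descent.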
