Quickly Finding a Benign Region via Heavy Ball Momentum in Non-Convex Optimization
Jun-Kun Wang, Jacob Abernethy

TL;DR
This paper demonstrates that the Heavy Ball method can provably accelerate convergence to benign regions in specific non-convex problems, such as phase retrieval and cubic-regularized minimization, revealing its potential advantages over gradient descent.
Contribution
The paper gives provable evidence that Heavy Ball momentum helps the iterate reach a benign region containing a global optimum faster in two non-convex problems, and it analyzes the simple dynamics that produce this speedup.
Findings
Heavy Ball enters a benign region containing a global optimum faster in phase retrieval and cubic-regularized minimization.
For these problems, a larger momentum parameter gets the iterate into the benign region sooner.
Heavy Ball exhibits simple, predictable dynamics in these problems.
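For readers who want to experiment, the following is a minimal sketch of Polyak's Heavy Ball update, w_{t+1} = w_t - step_size * grad(w_t) + momentum * (w_t - w_{t-1}), applied to a phase-retrieval style objective. The specific objective f(w) = (1/(4n)) * sum_i ((x_i^T w)^2 - y_i)^2, the function names, and the parameters are assumptions made for illustration and may differ from the exact setup analyzed in the paper.

```python
import numpy as np


def heavy_ball(grad, w0, step_size, momentum, num_iters):
    """Run deterministic Heavy Ball and return all iterates.

    Update: w_next = w - step_size * grad(w) + momentum * (w - w_prev).
    Setting momentum = 0 recovers plain gradient descent.
    """
    w_prev = np.asarray(w0, dtype=float)
    w = w_prev.copy()  # first step has zero momentum term
    trajectory = [w.copy()]
    for _ in range(num_iters):
        w_next = w - step_size * grad(w) + momentum * (w - w_prev)
        w_prev, w = w, w_next
        trajectory.append(w.copy())
    return np.array(trajectory)


def phase_retrieval_grad(X, y):
    """Gradient of the assumed objective (1/(4n)) * sum((x_i^T w)^2 - y_i)^2."""
    n = X.shape[0]

    def grad(w):
        z = X @ w
        return X.T @ ((z ** 2 - y) * z) / n

    return grad


if __name__ == "__main__":
    # Hypothetical toy instance: Gaussian measurements of a unit-norm signal.
    rng = np.random.default_rng(0)
    d, n = 10, 200
    w_star = rng.standard_normal(d)
    w_star /= np.linalg.norm(w_star)
    X = rng.standard_normal((n, d))
    y = (X @ w_star) ** 2
    grad = phase_retrieval_grad(X, y)
    w0 = 0.01 * rng.standard_normal(d)
    for beta in (0.0, 0.5, 0.9):
        traj = heavy_ball(grad, w0, step_size=0.01, momentum=beta, num_iters=300)
        # |<w_t, w_star>| tracks alignment with the signal up to sign.
        print(beta, abs(traj[-1] @ w_star))
```

Comparing the printed alignment across momentum values is one informal way to observe the effect the paper describes; the step size and iteration count here are arbitrary choices, not values from the paper.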
Abstract
The Heavy Ball Method, proposed by Polyak over five decades ago, is a first-order method for optimizing continuous functions. While its stochastic counterpart has proven extremely popular in training deep networks, there are almost no known functions where deterministic Heavy Ball is provably faster than the simple and classical gradient descent algorithm in non-convex optimization. The success of Heavy Ball has thus far eluded theoretical understanding. Our goal is to address this gap, and in the present work we identify two non-convex problems where we provably show that the Heavy Ball momentum helps the iterate to enter a benign region that contains a global optimal point faster. We show that Heavy Ball exhibits simple dynamics that clearly reveal the benefit of using a larger value of momentum parameter for the problems. The first of these optimization problems is the phase…
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Medical Image Segmentation Techniques
