A Geometric Structure of Acceleration and Its Role in Making Gradients Small Fast
Jongmin Lee, Chanwoo Park, Ernest K. Ryu

TL;DR
This paper uncovers a geometric structure underlying many accelerated optimization methods, leading to novel algorithms that achieve faster convergence rates in reducing gradient norms.
Contribution
It introduces a unifying geometric framework for accelerated methods and proposes new algorithms with improved convergence rates.
Findings
Proposes a geometric structure common to many accelerated methods.
Develops a new accelerated method with $\mathcal{O}(1/K^4)$ rate.
Demonstrates faster reduction of the squared gradient norm in the prox-grad setup than Nesterov's FGM or Kim and Fessler's FPGM-m.
Abstract
Since Nesterov's seminal 1983 work, many accelerated first-order optimization methods have been proposed, but their analyses lack a common unifying structure. In this work, we identify a geometric structure satisfied by a wide range of first-order accelerated methods. Using this geometric insight, we present several novel generalizations of accelerated methods. Most interesting among them is a method that reduces the squared gradient norm with $\mathcal{O}(1/K^4)$ rate in the prox-grad setup, faster than the rates of Nesterov's FGM or Kim and Fessler's FPGM-m.
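For readers unfamiliar with the baseline named in the abstract, the following is a minimal sketch of Nesterov's fast gradient method (FGM) in its common textbook form, tracking the squared gradient norm at each iterate. It is not the paper's new method and does not cover the prox-grad setup. The quadratic test objective, the function name `fgm_grad_norms`, and the momentum sequence `theta` are assumptions chosen for illustration.

```python
import numpy as np


def fgm_grad_norms(A, b, num_iters):
    """Run a textbook form of Nesterov's FGM on f(x) = 0.5 * x^T A x - b^T x.

    A is assumed symmetric positive semidefinite (illustrative objective only).
    Returns the squared gradient norms ||grad f(x_k)||^2 for k = 0..num_iters.
    """
    L = np.linalg.eigvalsh(A).max()  # smoothness constant of the quadratic

    def grad(x):
        return A @ x - b

    x = np.zeros_like(b, dtype=float)
    y = x.copy()
    theta = 1.0
    history = [float(np.sum(grad(x) ** 2))]
    for _ in range(num_iters):
        x_next = y - grad(y) / L  # gradient step from the extrapolated point
        theta_next = (1.0 + np.sqrt(1.0 + 4.0 * theta ** 2)) / 2.0
        # momentum (extrapolation) step
        y = x_next + ((theta - 1.0) / theta_next) * (x_next - x)
        x, theta = x_next, theta_next
        history.append(float(np.sum(grad(x) ** 2)))
    return history


if __name__ == "__main__":
    A = np.diag(np.arange(1.0, 11.0))  # hypothetical well-conditioned problem
    b = np.ones(10)
    norms = fgm_grad_norms(A, b, 200)
    print(norms[0], norms[-1])
```

Plotting `norms` against the iteration count on a log-log scale is one way to see empirically how quickly the squared gradient norm decays for a given problem.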
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Advanced Optimization Algorithms Research
