Anderson acceleration of gradient methods with energy for optimization problems
Hailiang Liu, Jia-Hao He, Xuping Tian

TL;DR
This paper introduces an optimization algorithm that applies Anderson acceleration to the energy adaptive gradient method (AEGD), and reports fast convergence with little hyperparameter tuning.
Contribution
It adapts Anderson acceleration (AA) to AEGD, examines the feasibility of the combination in light of existing convergence results for AEGD (which is not a fixed-point iteration), and quantifies how much AA speeds up gradient descent.
Findings
The convergence rate of AA for gradient descent improves by a factor given by the gain at each Anderson mixing step.
Requires minimal hyperparameter tuning.
Shows superior convergence speed in experiments.
Abstract
Anderson acceleration (AA), an efficient technique for speeding up the convergence of fixed-point iterations, can be adapted to accelerate optimization methods. We propose a novel optimization algorithm by adapting AA to the energy adaptive gradient method (AEGD) [arXiv:2010.05109]. Although AEGD is not a fixed-point iteration, the feasibility of our algorithm is examined in light of convergence results for AEGD. We also quantify the accelerated convergence rate of AA for gradient descent by a factor given by the gain at each implementation of the Anderson mixing. Our experimental results show that the proposed algorithm requires little tuning of hyperparameters and exhibits fast convergence.
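To make the idea of Anderson mixing concrete, here is a minimal sketch of standard AA applied to plain gradient descent, viewed as the fixed-point map g(x) = x - lr * grad f(x). It is not the paper's AA-AEGD algorithm. The function name `anderson_gd`, its parameters, and the quadratic test problem are all assumptions made for illustration.

```python
import numpy as np

def anderson_gd(grad, x0, lr, m=5, max_iter=10000, tol=1e-10):
    """Gradient descent with Anderson mixing over a window of m differences.

    Illustrative sketch only (hypothetical helper, not the paper's method).
    With m=0 it reduces to plain gradient descent.
    Returns the final iterate and the number of steps taken.
    """
    x = np.asarray(x0, dtype=float)
    G_hist, R_hist = [], []  # past values g(x_k) and residuals g(x_k) - x_k
    for k in range(max_iter):
        gx = x - lr * grad(x)   # one gradient-descent step, g(x)
        r = gx - x              # fixed-point residual
        if np.linalg.norm(r) < tol:
            return x, k
        # Keep at most m+1 entries, i.e. at most m consecutive differences.
        G_hist = (G_hist + [gx])[-(m + 1):]
        R_hist = (R_hist + [r])[-(m + 1):]
        if len(R_hist) == 1:
            x = gx              # no history yet: plain step
        else:
            n_diff = len(R_hist) - 1
            dR = np.stack([R_hist[i + 1] - R_hist[i] for i in range(n_diff)], axis=1)
            dG = np.stack([G_hist[i + 1] - G_hist[i] for i in range(n_diff)], axis=1)
            # Mixing coefficients: least-squares fit of the current residual
            # by the residual differences.
            gamma = np.linalg.lstsq(dR, r, rcond=None)[0]
            x = gx - dG @ gamma
    return x, max_iter

# Assumed test problem: f(x) = 0.5 * x^T A x - b^T x with diagonal A.
eigs = np.linspace(1.0, 100.0, 10)
b = np.ones(10)
grad = lambda x: eigs * x - b
x_star = b / eigs

x_gd, k_gd = anderson_gd(grad, np.zeros(10), lr=1.0 / eigs.max(), m=0)
x_aa, k_aa = anderson_gd(grad, np.zeros(10), lr=1.0 / eigs.max(), m=5)
print("plain GD steps:", k_gd, "| AA(5) steps:", k_aa)
```

On this ill-conditioned quadratic the mixed iteration should need far fewer steps than plain gradient descent at the same step size. The window size m trades memory and a small least-squares solve per step against the amount of acceleration.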
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Matrix Theory and Algorithms · Advanced Optimization Algorithms Research
