A Fast Anderson-Chebyshev Acceleration for Nonlinear Optimization

Zhize Li; Jian Li

arXiv:1809.02341 · math.OC · March 3, 2020 · 1 citation

Open Access

TL;DR

This paper shows that Anderson acceleration combined with Chebyshev polynomials achieves the optimal convergence rate $O(\sqrt{\kappa} \ln \frac{1}{\epsilon})$ for quadratic functions, gives a convergence analysis for general nonlinear problems, and proposes a guessing algorithm for hyperparameters that are not known in advance.

Contribution

It presents an Anderson-Chebyshev acceleration method with a proven optimal convergence rate for quadratic functions and a convergence analysis for general nonlinear problems, along with an algorithm that guesses hyperparameters dynamically when they are unavailable.

Findings

01

Achieves optimal convergence rate of O(√κ log(1/ε)) for quadratic functions.

02

Demonstrates significantly faster convergence than gradient descent and Nesterov's methods.

03

Dynamic hyperparameter guessing improves practical performance without prior parameter knowledge.
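
To give a sense of scale for finding 01, the following back-of-the-envelope comparison sets the rate against the earlier $O(\kappa \ln \frac{1}{\epsilon})$ bound mentioned in the abstract. It assumes $\kappa$ denotes the condition number (the usual reading, though not defined on this page), ignores the constants hidden by the $O(\cdot)$ notation, and uses illustrative values $\kappa = 10^{4}$ and $\epsilon = 10^{-6}$ that do not come from the paper:

$$
\ln \frac{1}{\epsilon} = \ln 10^{6} \approx 13.8, \qquad
\kappa \ln \frac{1}{\epsilon} \approx 1.38 \times 10^{5}, \qquad
\sqrt{\kappa} \ln \frac{1}{\epsilon} \approx 1.38 \times 10^{3}.
$$

Under these assumptions the iteration bound shrinks by a factor of $\sqrt{\kappa} = 100$.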

Abstract

Anderson acceleration (or Anderson mixing) is an efficient acceleration method for fixed-point iterations $x_{t+1} = G(x_t)$; e.g., gradient descent can be viewed as iteratively applying the operation $G(x) \triangleq x - \alpha \nabla f(x)$. It is known that Anderson acceleration is quite efficient in practice and can be viewed as an extension of Krylov subspace methods for nonlinear problems. In this paper, we show that Anderson acceleration with Chebyshev polynomial can achieve the optimal convergence rate $O(\sqrt{\kappa} \ln \frac{1}{\epsilon})$, which improves the previous result $O(\kappa \ln \frac{1}{\epsilon})$ provided by (Toth and Kelley, 2015) for quadratic functions. Moreover, we provide a convergence analysis for minimizing general nonlinear problems. Besides, if the hyperparameters (e.g., the Lipschitz smooth parameter $L$) are not available, we propose a guessing algorithm for…
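
The fixed-point view in the abstract can be illustrated with a minimal sketch of plain windowed Anderson acceleration applied to the gradient-descent map $G(x) = x - \alpha \nabla f(x)$ on a small quadratic. This is the generic textbook scheme, not the paper's Anderson-Chebyshev variant or its guessing algorithm. The function name `anderson_accelerate`, the window size `m`, the tolerance, and the test problem (a diagonal matrix `A` with condition number 100) are all assumptions made for illustration.

```python
import numpy as np


def anderson_accelerate(G, x0, m=5, max_iter=2000, tol=1e-8):
    """Generic windowed Anderson acceleration for the fixed point x = G(x).

    Hypothetical helper for illustration; returns (x, iterations_used).
    """
    x = np.asarray(x0, dtype=float)
    G_hist, F_hist = [], []  # recent values G(x_k) and residuals G(x_k) - x_k
    for k in range(max_iter):
        gx = G(x)
        f = gx - x
        if np.linalg.norm(f) < tol:
            return x, k
        G_hist.append(gx)
        F_hist.append(f)
        # Keep at most m + 1 entries, i.e. at most m difference columns.
        G_hist, F_hist = G_hist[-(m + 1):], F_hist[-(m + 1):]
        if len(F_hist) == 1:
            x = gx  # no history yet: take a plain fixed-point step
            continue
        dF = np.column_stack(
            [F_hist[i + 1] - F_hist[i] for i in range(len(F_hist) - 1)]
        )
        dG = np.column_stack(
            [G_hist[i + 1] - G_hist[i] for i in range(len(G_hist) - 1)]
        )
        # Least-squares mixing coefficients: minimize ||f - dF @ gamma||.
        gamma, *_ = np.linalg.lstsq(dF, f, rcond=None)
        x = gx - dG @ gamma
    return x, max_iter


# Illustrative quadratic f(x) = 0.5 * x^T A x - b^T x (assumed test problem).
A = np.diag(np.linspace(1.0, 100.0, 20))  # eigenvalues in [1, 100]
b = np.ones(20)
alpha = 1.0 / 100.0                       # step size 1/L with L = 100
x_star = b / np.diag(A)                   # exact minimizer of the quadratic


def gd_map(x):
    """Gradient-descent map G(x) = x - alpha * grad f(x)."""
    return x - alpha * (A @ x - b)


x_acc, n_iter = anderson_accelerate(gd_map, np.zeros(20))
```

Here `x_acc` is the accelerated iterate and `n_iter` the number of iterations used before the residual $\|G(x) - x\|$ fell below `tol`. Swapping `gd_map` for any other contraction map leaves the routine unchanged, since it only needs evaluations of `G`.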

Taxonomy

Topics: Matrix Theory and Algorithms · Advanced Optimization Algorithms Research · Stochastic Gradient Optimization Techniques