Convergence Analysis of Proximal Gradient with Momentum for Nonconvex Optimization

Qunwei Li; Yi Zhou; Yingbin Liang; Pramod K. Varshney

arXiv:1705.04925 · cs.LG · May 16, 2017 · 37 cites

Open Access

TL;DR

This paper analyzes the convergence of an accelerated proximal gradient method with momentum for nonconvex optimization, providing theoretical guarantees and proposing stochastic and adaptive variants to improve performance.

Contribution

The paper offers a rigorous convergence analysis of the accelerated proximal gradient method for nonconvex programming (APGnc), introduces stochastic and inexact variants, and develops an adaptive momentum strategy.

Findings

01

Limit points are critical points of the objective.

02

Establishes linear and sub-linear convergence rates.

03

Proposes stochastic variance reduced and adaptive momentum methods.

Abstract

In many modern machine learning applications, the structure of the underlying mathematical models often yields nonconvex optimization problems. Because nonconvex problems are intractable in general, there is a rising need for efficient methods that solve general nonconvex problems with certain performance guarantees. In this work, we investigate the accelerated proximal gradient method for nonconvex programming (APGnc). The method compares a usual proximal gradient step with a linear extrapolation step, and accepts the one that has the lower function value to achieve a monotonic decrease. Specifically, under a general nonsmooth and nonconvex setting, we provide a rigorous argument to show that the limit points of the sequence generated by APGnc are critical points of the objective function. Then, by exploiting the Kurdyka-Łojasiewicz (KL) property for a broad class of functions, we establish…
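
The compare-and-accept idea described in the abstract can be sketched roughly as follows. This is an illustrative reading of the abstract only, not the paper's exact APGnc update: the test objective (a smooth nonconvex log loss plus an l1 penalty), the names `TARGET`, `LAM`, `STEP`, and `apg_sketch`, the momentum weight k/(k+3), and the step size are all assumptions chosen to keep the example small.

```python
import math

# Hypothetical toy problem: F(x) = sum_i log(1 + (x_i - t_i)^2) + LAM * ||x||_1.
TARGET = [2.0, -1.0, 0.05]  # assumed data, not from the paper
LAM = 0.1                   # assumed l1 weight
STEP = 0.4                  # assumed step size; below 1/L, with L = 2 for this smooth term


def smooth(x):
    # Smooth but nonconvex part of the objective.
    return sum(math.log(1.0 + (xi - ti) ** 2) for xi, ti in zip(x, TARGET))


def grad_smooth(x):
    return [2.0 * (xi - ti) / (1.0 + (xi - ti) ** 2) for xi, ti in zip(x, TARGET)]


def objective(x):
    return smooth(x) + LAM * sum(abs(xi) for xi in x)


def prox_l1(v, t):
    # Proximal map of t * ||.||_1, i.e. coordinate-wise soft thresholding.
    return [math.copysign(max(abs(vi) - t, 0.0), vi) for vi in v]


def prox_grad_step(x):
    g = grad_smooth(x)
    return prox_l1([xi - STEP * gi for xi, gi in zip(x, g)], STEP * LAM)


def apg_sketch(x0, iters=200):
    x_prev, x = list(x0), list(x0)
    history = [objective(x)]
    for k in range(1, iters + 1):
        beta = k / (k + 3.0)  # assumed momentum schedule
        # Linear extrapolation along the last displacement.
        y = [xi + beta * (xi - pi) for xi, pi in zip(x, x_prev)]
        plain = prox_grad_step(x)   # usual proximal gradient step
        extrap = prox_grad_step(y)  # step taken from the extrapolated point
        # Accept whichever candidate has the lower objective value.
        x_next = extrap if objective(extrap) < objective(plain) else plain
        x_prev, x = x, x_next
        history.append(objective(x))
    return x, history


x_final, history = apg_sketch([0.0, 0.0, 0.0])
```

Because the plain proximal gradient candidate already decreases the objective when the step size is below 1/L, taking the better of the two candidates keeps `history` nonincreasing in this sketch, which mirrors the monotonic decrease the abstract mentions.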

Taxonomy

Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Optimization and Variational Analysis