Catalyst Acceleration for Gradient-Based Non-Convex Optimization

Courtney Paquette; Hongzhou Lin; Dmitriy Drusvyatskiy; Julien Mairal; Zaid Harchaoui

arXiv:1703.10993 · stat.ML · January 3, 2019 · 23 cites

Open Access

TL;DR

This paper presents a generic scheme that lets gradient-based algorithms designed for convex problems be applied to non-convex optimization. The scheme is guaranteed to converge to stationary points and accelerates automatically when the objective is convex.

Contribution

It introduces a generic, adaptive scheme that extends algorithms designed for convex optimization to weakly convex (possibly non-convex) objectives, without requiring prior knowledge of the weak convexity constant.
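
For reference, the standard definition of weak convexity that this contribution relies on can be written as follows. The symbols \rho, \kappa, and y are notation assumed here for illustration; they do not appear on this page.

```latex
\[
f \text{ is } \rho\text{-weakly convex}
\quad\Longleftrightarrow\quad
x \mapsto f(x) + \frac{\rho}{2}\,\lVert x \rVert^{2} \text{ is convex.}
\]
\[
\kappa > \rho
\;\Longrightarrow\;
x \mapsto f(x) + \frac{\kappa}{2}\,\lVert x - y \rVert^{2}
\text{ is } (\kappa - \rho)\text{-strongly convex for every fixed } y.
\]
```

The second line is why a method built for convex problems can be reused: adding a large enough quadratic term turns each subproblem into a convex one.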

Findings

01. Guarantees convergence to a stationary point with a worst-case efficiency typical of first-order methods.

02. Automatically accelerates (in the sense of Nesterov) when the objective is convex, achieving a near-optimal convergence rate in function values.

03. Demonstrates effectiveness on neural network training and matrix factorization.

Abstract

We introduce a generic scheme to solve non-convex optimization problems using gradient-based algorithms originally designed for minimizing convex functions. Even though these methods may originally require convexity to operate, the proposed approach allows one to use them on weakly convex objectives, which covers a large class of non-convex functions typically appearing in machine learning and signal processing. In general, the scheme is guaranteed to produce a stationary point with a worst-case efficiency typical of first-order methods, and when the objective turns out to be convex, it automatically accelerates in the sense of Nesterov and achieves a near-optimal convergence rate in function values. These properties are achieved without assuming any knowledge about the convexity of the objective, by automatically adapting to the unknown weak convexity constant. We conclude the paper by…
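
The core idea in the abstract, reusing a convex solver on a regularized subproblem, can be illustrated with a minimal sketch. This is a plain inexact proximal-point loop, not the paper's algorithm: it uses a fixed `kappa` instead of adapting to the unknown weak convexity constant, and it omits Nesterov-style extrapolation. The test function, step size, iteration counts, and all function names are assumptions chosen for illustration.

```python
def grad_f(x):
    # Gradient of the toy objective f(x) = x**4 / 4 - x**2 / 2.
    # f is non-convex but 1-weakly convex, since f''(x) = 3 * x**2 - 1 >= -1.
    return x**3 - x


def solve_subproblem(grad, center, kappa, x0, step=0.05, iters=200):
    # Plain gradient descent on h(z) = f(z) + (kappa / 2) * (z - center)**2.
    # With kappa > 1 this subproblem is strongly convex for the toy f above,
    # so a method designed for convex problems can be applied to it.
    z = x0
    for _ in range(iters):
        z -= step * (grad(z) + kappa * (z - center))
    return z


def proximal_point(grad, x0, kappa=2.0, outer_iters=50):
    # Outer loop: repeatedly re-center the quadratic at the current iterate.
    # kappa is fixed here (an assumption); it is not adapted automatically.
    x = x0
    for _ in range(outer_iters):
        x = solve_subproblem(grad, center=x, kappa=kappa, x0=x)
    return x


x_final = proximal_point(grad_f, x0=2.0)
print(x_final, abs(grad_f(x_final)))
```

Starting from `x0=2.0`, the iterates approach the stationary point at `x = 1`, where the gradient vanishes. Any convex solver could replace the inner gradient descent.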

Taxonomy

Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Numerical methods in inverse problems

Methods: SAGA