# Behavior of Accelerated Gradient Methods Near Critical Points of Nonconvex Functions

**Authors:** Michael O'Neill, Stephen J. Wright

arXiv: 1706.07993 · 2018-10-09

## TL;DR

This paper analyzes how accelerated gradient methods, especially the heavy-ball method, behave near strict saddle points in smooth nonconvex optimization. It shows that the heavy-ball method is unlikely to converge to such points and that accelerated methods can move away from them faster than steepest descent.

## Contribution

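The paper uses the stable manifold theorem to show that the heavy-ball method is unlikely to converge to strict saddle points. It then analyzes a nonconvex quadratic function to show that the heavy-ball method and other accelerated gradient methods can diverge from a strict saddle point more rapidly than steepest descent.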

## Key findings

- Heavy-ball method unlikely to converge to strict saddle points
- Accelerated methods can diverge faster than steepest descent near a strict saddle point of a nonconvex quadratic
- Stable manifold theorem used to analyze convergence behavior
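
The second finding can be illustrated with a short worked calculation. This is a sketch under stated assumptions, not the paper's own derivation: the symbols $\alpha$ (step size), $\beta$ (momentum parameter), and $\delta$ (magnitude of a negative Hessian eigenvalue) are notation chosen here, and both methods are assumed to use the same $\alpha$. Along an eigen-direction of a quadratic with eigenvalue $-\delta < 0$, write $z_k$ for the component of the iterate measured from the saddle point:

$$
\begin{aligned}
\text{steepest descent:}\quad & z_{k+1} = (1 + \alpha\delta)\, z_k, \\
\text{heavy ball:}\quad & z_{k+1} = (1 + \beta + \alpha\delta)\, z_k - \beta\, z_{k-1}, \\
\text{characteristic polynomial:}\quad & p(t) = t^2 - (1 + \beta + \alpha\delta)\, t + \beta, \\
\text{evaluated at the steepest-descent rate:}\quad & p(1 + \alpha\delta) = -\alpha\beta\delta < 0 .
\end{aligned}
$$

Because $p$ is negative at $t = 1 + \alpha\delta$, that value lies strictly between the two roots of $p$, so the larger root exceeds $1 + \alpha\delta$. Under these assumptions, with $\alpha, \beta, \delta > 0$, the dominant growth factor of the heavy-ball recurrence along the negative-curvature direction is larger than the steepest-descent factor.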

## Abstract

We examine the behavior of accelerated gradient methods in smooth nonconvex unconstrained optimization, focusing in particular on their behavior near strict saddle points. Accelerated methods are iterative methods that typically step along a direction that is a linear combination of the previous step and the gradient of the function evaluated at a point at or near the current iterate. (The previous step encodes gradient information from earlier stages in the iterative process.) We show by means of the stable manifold theorem that the heavy-ball method is unlikely to converge to strict saddle points, which are points at which the gradient of the objective is zero but the Hessian has at least one negative eigenvalue. We then examine the behavior of the heavy-ball method and other accelerated gradient methods in the vicinity of a strict saddle point of a nonconvex quadratic function, showing that both methods can diverge from this point more rapidly than the steepest-descent method.
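
The behavior described in the abstract can be tried numerically. The following is a minimal sketch, not the authors' experiment: the quadratic `f(x) = 0.5 * (x[0]**2 - delta * x[1]**2)`, the function names, and the values of `alpha`, `beta`, `delta`, `steps`, and `x0` are all hypothetical choices made for illustration. The origin is a strict saddle point of this function, and the heavy-ball step is written as the usual combination of a gradient step and the previous step.

```python
def grad(x, delta):
    """Gradient of the illustrative quadratic f(x) = 0.5 * (x[0]**2 - delta * x[1]**2)."""
    return (x[0], -delta * x[1])


def steepest_descent(x0, alpha, delta, steps):
    """Fixed-step steepest descent: x_{k+1} = x_k - alpha * grad f(x_k)."""
    x = x0
    for _ in range(steps):
        g = grad(x, delta)
        x = (x[0] - alpha * g[0], x[1] - alpha * g[1])
    return x


def heavy_ball(x0, alpha, beta, delta, steps):
    """Heavy ball: x_{k+1} = x_k - alpha * grad f(x_k) + beta * (x_k - x_{k-1})."""
    x_prev, x = x0, x0  # start with zero momentum (an assumption of this sketch)
    for _ in range(steps):
        g = grad(x, delta)
        x_next = (
            x[0] - alpha * g[0] + beta * (x[0] - x_prev[0]),
            x[1] - alpha * g[1] + beta * (x[1] - x_prev[1]),
        )
        x_prev, x = x, x_next
    return x


# Hypothetical parameters, chosen only for illustration.
delta, alpha, beta, steps = 0.01, 0.5, 0.9, 200
x0 = (1.0, 1e-3)  # small initial component along the negative-curvature direction

gd = steepest_descent(x0, alpha, delta, steps)
hb = heavy_ball(x0, alpha, beta, delta, steps)

# x[1] is the coordinate along the negative-curvature direction;
# its magnitude measures how far each method has moved away from the saddle.
print("steepest descent |x[1]|:", abs(gd[1]))
print("heavy ball       |x[1]|:", abs(hb[1]))
```

With these particular values, both methods shrink the coordinate along the positive-curvature direction, while the heavy-ball iterate ends up much farther from the saddle along the negative-curvature direction than the steepest-descent iterate. Other parameter choices, in particular different step sizes for the two methods, may change the size of the gap.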

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1706.07993/full.md

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/1706.07993/full.md

## References

17 references — full list in the complete paper: https://tomesphere.com/paper/1706.07993/full.md

---
Source: https://tomesphere.com/paper/1706.07993