# Exploiting Negative Curvature in Deterministic and Stochastic Optimization

**Authors:** Frank E. Curtis, Daniel P. Robinson

arXiv: 1703.00412 · 2018-04-05

## TL;DR

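This paper introduces new frameworks for combining descent and negative curvature directions, with algorithmic decisions based on (estimated) upper-bounding models of the objective function. Instances of the dynamic framework outperform related descent-only methods on deterministic problems, and gains are also possible in stochastic settings where a standard stochastic-gradient-type method might make slow progress.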

## Contribution

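The paper proposes two new frameworks for combining descent and negative curvature steps: alternating two-step approaches and dynamic step approaches. Because their decisions rest on upper-bounding models of the objective, the frameworks can, in theory, use fixed stepsizes, which makes them readily translatable from deterministic to stochastic settings.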

## Key findings

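- For deterministic problems, instances of the dynamic framework yield performance gains over related methods that follow only descent steps.
- In stochastic settings, gains are possible in cases where a standard stochastic-gradient-type method might make slow progress.
- The frameworks make algorithmic decisions from (estimated) upper-bounding models of the objective function, which in theory permits fixed stepsizes.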
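
As a rough illustration of what an upper-bounding model can look like, the standard bounds below hold under the assumption that the gradient of $f$ is Lipschitz continuous with constant $L$ and the Hessian is Lipschitz continuous with constant $\sigma$. The symbols $g = \nabla f(x)$, $H = \nabla^2 f(x)$, $s$ (a descent step), and $d$ (a negative curvature step) are notation chosen here for illustration; the paper's own models and constants may differ.

$$
\begin{aligned}
f(x + s) &\le f(x) + g^{T} s + \tfrac{L}{2}\,\|s\|^{2}, \\
f(x + d) &\le f(x) + g^{T} d + \tfrac{1}{2}\, d^{T} H d + \tfrac{\sigma}{6}\,\|d\|^{3}.
\end{aligned}
$$

Minimizing the first bound along $s = -\alpha g$ gives the fixed stepsize $\alpha = 1/L$ and a guaranteed decrease of $\|g\|^{2}/(2L)$, which is one way to see how such models can remove the need for a line search.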

## Abstract

This paper addresses the question of whether it can be beneficial for an optimization algorithm to follow directions of negative curvature. Although prior work has established convergence results for algorithms that integrate both descent and negative curvature steps, there has not yet been extensive numerical evidence showing that such methods offer consistent performance improvements. In this paper, we present new frameworks for combining descent and negative curvature directions: alternating two-step approaches and dynamic step approaches. The aspect that distinguishes our approaches from ones previously proposed is that they make algorithmic decisions based on (estimated) upper-bounding models of the objective function. A consequence of this aspect is that our frameworks can, in theory, employ fixed stepsizes, which makes the methods readily translatable from deterministic to stochastic settings. For deterministic problems, we show that instances of our dynamic framework yield gains in performance compared to related methods that only follow descent steps. We also show that gains can be made in a stochastic setting in cases when a standard stochastic-gradient-type method might make slow progress.
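
The idea of choosing between a descent step and a negative curvature step by comparing upper-bounding models can be sketched on a toy problem. This is a minimal sketch under stated assumptions, not the authors' algorithm: the objective `f`, the constants `L_GRAD` and `SIGMA`, the helper names, and the rule "take whichever step has the larger guaranteed model reduction" are all illustrative choices made here. The toy objective has a saddle point at the origin and minimizers at (0, ±1).

```python
import numpy as np

# Assumed constants (not from the paper): upper bounds on the Lipschitz
# constants of the gradient and Hessian of the toy objective for |x[1]| <= 1.5.
L_GRAD = 6.0
SIGMA = 9.0


def f(x):
    # Toy objective: saddle point at (0, 0), minimizers at (0, +1) and (0, -1).
    return 0.5 * x[0] ** 2 + 0.25 * x[1] ** 4 - 0.5 * x[1] ** 2


def grad(x):
    return np.array([x[0], x[1] ** 3 - x[1]])


def hess(x):
    return np.diag([1.0, 3.0 * x[1] ** 2 - 1.0])


def descent_step(g):
    # Minimize the quadratic upper bound along -g: stepsize 1/L,
    # guaranteed reduction ||g||^2 / (2L).
    return -g / L_GRAD, float(g @ g) / (2.0 * L_GRAD)


def curvature_step(g, H):
    # Leftmost eigenpair of the Hessian (eigh returns ascending eigenvalues).
    eigvals, eigvecs = np.linalg.eigh(H)
    lam, d = eigvals[0], eigvecs[:, 0]
    if lam >= 0.0:
        return np.zeros_like(g), 0.0  # no negative curvature available
    if g @ d > 0.0:
        d = -d  # orient d so that it is not an ascent direction
    gd = float(g @ d)
    # Minimizer over beta > 0 of the cubic upper bound
    # beta*gd + 0.5*beta^2*lam + (SIGMA/6)*beta^3.
    beta = (-lam + np.sqrt(lam ** 2 - 2.0 * SIGMA * gd)) / SIGMA
    reduction = -(beta * gd + 0.5 * beta ** 2 * lam + SIGMA / 6.0 * beta ** 3)
    return beta * d, float(reduction)


def run(x0, iters=200, use_curvature=True):
    x = np.array(x0, dtype=float)
    for _ in range(iters):
        g = grad(x)
        step, reduction = descent_step(g)
        if use_curvature:
            d_step, d_reduction = curvature_step(g, hess(x))
            if d_reduction > reduction:
                step = d_step  # the curvature model promises more decrease
        x = x + step
    return x


if __name__ == "__main__":
    start = [1.0, 0.0]
    print("descent only:", run(start, use_curvature=False))
    print("dynamic     :", run(start, use_curvature=True))
```

Started from (1, 0), the descent-only run never leaves the line `x[1] = 0` and approaches the saddle point, while the dynamic run takes a negative curvature step once the gradient becomes small and then reaches a minimizer near (0, ±1). No line search is used in either case; both stepsizes come directly from the assumed constants.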

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1703.00412/full.md

## Figures

14 figures with captions in the complete paper: https://tomesphere.com/paper/1703.00412/full.md

## References

27 references — full list in the complete paper: https://tomesphere.com/paper/1703.00412/full.md

---
Source: https://tomesphere.com/paper/1703.00412