# Combining Stochastic Adaptive Cubic Regularization with Negative Curvature for Nonconvex Optimization

**Authors:** Seonho Park, Seung Hyun Jung, Panos M. Pardalos

arXiv: 1906.11417 · 2019-06-28

## TL;DR

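This paper introduces SANC (Stochastic Adaptive cubic regularization with Negative Curvature), an algorithm for nonconvex finite-sum minimization. It combines negative-curvature steps with the adaptive cubic regularized Newton method so that the iterate is updated even on unsuccessful iterations, and it estimates gradients and Hessians from independent samples of consistent size across iterations, which makes it more practical for large-scale machine learning problems.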

## Contribution

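To the best of the authors' knowledge, SANC is the first method to combine the negative curvature method with the adaptive cubic regularized Newton method. The negative-curvature step lets the algorithm make progress on iterations that the trust region-like acceptance test would otherwise reject.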
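
For orientation, the two ingredients can be written in the notation commonly used for adaptive cubic regularization. The symbols below ($g_k$, $B_k$, $\sigma_k$, $\rho_k$, $v_k$) are assumed here for illustration and are not taken from the paper; the paper's exact definitions and constants are in the full text.

$$
\begin{aligned}
m_k(s) &= f(x_k) + g_k^\top s + \tfrac{1}{2}\, s^\top B_k s + \tfrac{\sigma_k}{3}\,\lVert s \rVert^3, \\
\rho_k &= \frac{f(x_k) - f(x_k + s_k)}{f(x_k) - m_k(s_k)}, \\
v_k^\top B_k v_k &= \lambda_{\min}(B_k) < 0, \qquad \lVert v_k \rVert = 1, \qquad g_k^\top v_k \le 0 .
\end{aligned}
$$

Here $g_k$ and $B_k$ stand for the stochastic gradient and Hessian estimates, $\sigma_k$ is the adaptive regularization weight, and $s_k$ approximately minimizes $m_k$. An iteration is typically called successful when $\rho_k$ exceeds a threshold; $v_k$ is a direction of negative curvature that can be followed when it does not.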

## Key findings

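- SANC builds on the adaptive cubic regularized Newton method, which has strong global convergence guarantees and can escape strict saddle points.
- Experiments, including neural network problems, support the efficiency of the method.
- Using independent data samples of consistent size for the gradient and Hessian estimators makes the method more practical for large-scale machine learning applications.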

## Abstract

We focus on minimizing nonconvex finite-sum functions that typically arise in machine learning problems. In an attempt to solve this problem, the adaptive cubic regularized Newton method has shown its strong global convergence guarantees and ability to escape from strict saddle points. This method uses a trust region-like scheme to determine if an iteration is successful or not, and updates only when it is successful. In this paper, we suggest an algorithm combining negative curvature with the adaptive cubic regularized Newton method to update even at unsuccessful iterations. We call this new method Stochastic Adaptive cubic regularization with Negative Curvature (SANC). Unlike the previous method, in order to attain stochastic gradient and Hessian estimators, the SANC algorithm uses independent sets of data points of consistent size over all iterations. It makes the SANC algorithm more practical to apply for solving large-scale machine learning problems. To the best of our knowledge, this is the first approach that combines the negative curvature method with the adaptive cubic regularized Newton method. Finally, we provide experimental results including neural networks problems supporting the efficiency of our method.
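
The loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy objective (`make_toy`, `loss`, `grad`, `hess`), the exact eigendecomposition-based subproblem solver, the negative-curvature step length `|lambda_min| / sigma`, and the parameters `eta`, `gamma`, and `batch` are all assumptions chosen for illustration. What it does mirror from the abstract is the structure: fixed-size, independently drawn batches for the gradient and Hessian, a trust region-like success test, and a negative-curvature update on unsuccessful iterations.

```python
import numpy as np

def cubic_step(g, H, sigma):
    """Minimize g.s + 0.5 s.H.s + (sigma/3)||s||^3 (ignores the degenerate 'hard case')."""
    w, V = np.linalg.eigh(H)                  # eigenvalues in ascending order
    gt = V.T @ g                              # gradient in the eigenbasis
    norm_s = lambda lam: np.linalg.norm(gt / (w + lam))
    lo = max(0.0, -w[0]) + 1e-10              # keep H + lam*I positive definite
    hi = lo + 1.0
    while norm_s(hi) > hi / sigma:            # bracket the root of ||s(lam)|| = lam / sigma
        hi *= 2.0
    for _ in range(100):                      # bisection on the secular equation
        mid = 0.5 * (lo + hi)
        if norm_s(mid) > mid / sigma:
            lo = mid
        else:
            hi = mid
    return V @ (-gt / (w + hi))

def negative_curvature_step(g, H, sigma):
    """Step along the most negative eigenvector of H; zero if H has none."""
    w, V = np.linalg.eigh(H)
    if w[0] >= 0.0:
        return np.zeros_like(g)
    v = V[:, 0]
    if g @ v > 0.0:                           # orient v so it is not an ascent direction
        v = -v
    return (abs(w[0]) / sigma) * v            # assumed step length, not from the paper

# Hypothetical finite-sum toy problem: f_i(x) = 0.5 x.A_i.x + 0.25 ||x||^4,
# whose average has a strict saddle at the origin.
def make_toy(n=200, seed=0):
    rng = np.random.default_rng(seed)
    noise = 0.1 * rng.standard_normal((n, 2, 2))
    return np.diag([1.0, -1.0]) + 0.5 * (noise + noise.transpose(0, 2, 1))

def loss(A, idx, x):
    return 0.5 * x @ A[idx].mean(axis=0) @ x + 0.25 * (x @ x) ** 2

def grad(A, idx, x):
    return A[idx].mean(axis=0) @ x + (x @ x) * x

def hess(A, idx, x):
    return A[idx].mean(axis=0) + (x @ x) * np.eye(x.size) + 2.0 * np.outer(x, x)

def sanc_sketch(A, x0, batch=32, sigma=1.0, eta=0.1, gamma=2.0, iters=100, seed=1):
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    n = A.shape[0]
    for _ in range(iters):
        # Independent batches of the same size at every iteration.
        ig = rng.choice(n, size=batch, replace=False)
        ih = rng.choice(n, size=batch, replace=False)
        g, H = grad(A, ig, x), hess(A, ih, x)
        s = cubic_step(g, H, sigma)
        model_decrease = -(g @ s + 0.5 * s @ H @ s + sigma / 3.0 * np.linalg.norm(s) ** 3)
        rho = (loss(A, ig, x) - loss(A, ig, x + s)) / max(model_decrease, 1e-16)
        if rho >= eta:                        # successful: accept the cubic step, relax sigma
            x = x + s
            sigma = max(sigma / gamma, 1e-3)
        else:                                 # unsuccessful: still move along negative curvature
            x = x + negative_curvature_step(g, H, sigma)
            sigma = sigma * gamma
    return x
```

A call such as `sanc_sketch(make_toy(), [0.5, 0.0])` runs the loop from a point on the stable manifold of the saddle. The exact eigendecomposition is only affordable in this two-dimensional toy; at scale one would presumably use Hessian-vector products and an iterative solver instead.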

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1906.11417/full.md

## Figures

24 figures with captions in the complete paper: https://tomesphere.com/paper/1906.11417/full.md

## References

34 references — full list in the complete paper: https://tomesphere.com/paper/1906.11417/full.md

---
Source: https://tomesphere.com/paper/1906.11417