A Stochastic Semismooth Newton Method for Nonsmooth Nonconvex Optimization

Andre Milzarek; Xiantao Xiao; Shicong Cen; Zaiwen Wen; Michael Ulbrich

arXiv:1803.03466·math.OC·March 12, 2018·SIAM J. Optim.

TL;DR

This paper introduces a globalized stochastic semismooth Newton method for nonsmooth nonconvex optimization. The method uses only noisy gradient and Hessian information obtained from stochastic oracles, and it comes with both global and local convergence guarantees.

Contribution

The paper develops a hybrid method that combines stochastic semismooth Newton steps with stochastic proximal gradient steps, analyzes its convergence, and demonstrates its effectiveness on logistic regression and classification tasks.

Findings

01 Converges globally to stationary points in expectation.

02 Achieves local superlinear convergence with high probability.

03 Shows improved efficiency over existing methods in experiments.
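
To make "stationary point" in the first finding concrete, the block below writes out one standard formalization for composite problems. The symbols \psi, f, \varphi, \lambda and F^{\lambda} are notation assumed here for illustration and may differ from the paper's; only the composite structure (a smooth nonconvex term plus a nonsmooth convex term) is taken from the abstract.

```latex
\begin{gather*}
\min_{x \in \mathbb{R}^n} \; \psi(x) := f(x) + \varphi(x),
\qquad f \text{ smooth (possibly nonconvex)}, \quad \varphi \text{ convex (possibly nonsmooth)}, \\
F^{\lambda}(x) := x - \operatorname{prox}_{\lambda\varphi}\bigl(x - \lambda \nabla f(x)\bigr),
\qquad \lambda > 0, \\
0 \in \nabla f(x^*) + \partial\varphi(x^*)
\;\Longleftrightarrow\;
F^{\lambda}(x^*) = 0 .
\end{gather*}
```

Under this reading, convergence "in expectation" would be a statement about the expected size of such a residual along the iterates. The precise statement and its assumptions are those given in the paper.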

Abstract

In this work, we present a globalized stochastic semismooth Newton method for solving stochastic optimization problems involving smooth nonconvex and nonsmooth convex terms in the objective function. We assume that only noisy gradient and Hessian information of the smooth part of the objective function is available via calling stochastic first and second order oracles. The proposed method can be seen as a hybrid approach combining stochastic semismooth Newton steps and stochastic proximal gradient steps. Two inexact growth conditions are incorporated to monitor the convergence and the acceptance of the semismooth Newton steps and it is shown that the algorithm converges globally to stationary points in expectation. Moreover, under standard assumptions and utilizing random matrix concentration inequalities, we prove that the proposed approach locally turns into a pure stochastic…
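
As a rough illustration of the proximal gradient half of the hybrid scheme described in the abstract, the sketch below runs stochastic proximal gradient steps on a toy problem. Everything specific in it is an assumption made for illustration and not taken from the paper: the smooth part is a small least-squares loss (convex, chosen only for brevity), the nonsmooth part is `mu * ||x||_1`, the "oracle" is a mini-batch gradient, and all function names are hypothetical. The semismooth Newton steps and the growth conditions that decide when to accept them are not implemented here.

```python
import random


def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1, applied componentwise."""
    return [max(abs(vi) - tau, 0.0) * (1.0 if vi > 0 else -1.0) for vi in v]


def minibatch_gradient(x, data, batch_size, rng):
    """Noisy gradient of f(x) = (1/2N) * sum_i (a_i^T x - b_i)^2.

    Stands in for a stochastic first-order oracle (assumption): the
    gradient is averaged over a random mini-batch instead of all data.
    """
    batch = rng.sample(data, batch_size)
    grad = [0.0] * len(x)
    for a, b in batch:
        r = sum(ai * xi for ai, xi in zip(a, x)) - b
        for j, aj in enumerate(a):
            grad[j] += r * aj / batch_size
    return grad


def prox_gradient_step(x, grad, step, mu):
    """One proximal gradient step: prox_{step*mu*||.||_1}(x - step*grad)."""
    shifted = [xi - step * gi for xi, gi in zip(x, grad)]
    return soft_threshold(shifted, step * mu)


def natural_residual(x, grad, step, mu):
    """x - prox(x - step*grad); zero at stationary points if grad is exact."""
    x_new = prox_gradient_step(x, grad, step, mu)
    return [xi - ni for xi, ni in zip(x, x_new)]


# Toy run: four samples (a_i, b_i) in two dimensions, fixed seed.
rng = random.Random(0)
data = [([1.0, 0.0], 2.0), ([0.0, 1.0], 0.0),
        ([1.0, 1.0], 2.0), ([1.0, -1.0], 2.0)]
x = [0.0, 0.0]
for _ in range(200):
    g = minibatch_gradient(x, data, batch_size=2, rng=rng)
    x = prox_gradient_step(x, g, step=0.1, mu=0.01)
```

The `natural_residual` helper is the kind of quantity a Newton-type step would try to drive to zero; here it serves only as a stationarity measure for the toy problem.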
