A Stochastic Extra-Step Quasi-Newton Method for Nonsmooth Nonconvex Optimization
Minghan Yang, Andre Milzarek, Zaiwen Wen, Tong Zhang

TL;DR
This paper introduces a stochastic extra-step quasi-Newton method tailored for nonsmooth nonconvex optimization, combining stochastic higher order and proximal gradient steps, with proven convergence and practical effectiveness in large-scale applications.
Contribution
It develops a novel stochastic quasi-Newton algorithm with convergence guarantees for nonsmooth nonconvex problems, incorporating variance reduction and coordinate-type schemes.
Findings
Converges to stationary points in expectation under suitable step size bounds.
Effective in large-scale logistic regression and deep learning tasks.
Outperforms several state-of-the-art methods in experiments.
Abstract
In this paper, a novel stochastic extra-step quasi-Newton method is developed to solve a class of nonsmooth nonconvex composite optimization problems. We assume that the gradient of the smooth part of the objective function can only be approximated by stochastic oracles. The proposed method combines general stochastic higher order steps, derived from an underlying proximal type fixed-point equation, with additional stochastic proximal gradient steps to guarantee convergence. Based on suitable bounds on the step sizes, we establish global convergence to stationary points in expectation, and we discuss an extension of the approach using variance reduction techniques. Motivated by large-scale and big data applications, we investigate a stochastic coordinate-type quasi-Newton scheme that makes it possible to generate cheap and tractable stochastic higher order directions. Finally, the proposed…
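The two-stage structure described in the abstract (a higher order step built from a proximal fixed-point residual, followed by an extra stochastic proximal gradient step) can be illustrated with a minimal sketch. This is not the authors' exact update rule. It assumes an l1-regularized logistic regression objective, a fixed step size, and a placeholder matrix W (identity by default) standing in for a quasi-Newton approximation; with the identity, the first stage reduces to a relaxed proximal gradient step. All function and parameter names (soft_threshold, extra_step_sketch, beta, mu, and so on) are hypothetical and chosen for illustration only.

```python
import numpy as np


def soft_threshold(u, tau):
    """Proximal operator of tau * ||.||_1 (assumed regularizer)."""
    return np.sign(u) * np.maximum(np.abs(u) - tau, 0.0)


def stochastic_grad(x, A, b, idx):
    """Mini-batch gradient of mean(log(1 + exp(-b * (A @ x))))."""
    Ab, bb = A[idx], b[idx]
    margins = bb * (Ab @ x)
    # 1 / (1 + exp(m)) computed stably via logaddexp
    coeff = -bb * np.exp(-np.logaddexp(0.0, margins))
    return Ab.T @ coeff / len(idx)


def objective(x, A, b, mu):
    """Full objective: logistic loss plus mu * ||x||_1."""
    return np.mean(np.logaddexp(0.0, -b * (A @ x))) + mu * np.sum(np.abs(x))


def residual(x, g, step, mu):
    """Proximal fixed-point residual F(x) = x - prox(x - step * g)."""
    return x - soft_threshold(x - step * g, step * mu)


def extra_step_sketch(A, b, mu=0.01, step=0.5, beta=0.5,
                      iters=200, batch=32, W=None, seed=0):
    """Illustrative extra-step iteration (assumption, not the paper's scheme)."""
    rng = np.random.default_rng(seed)
    n, dim = A.shape
    batch = min(batch, n)
    # Placeholder for a quasi-Newton matrix; identity if none is supplied.
    W = np.eye(dim) if W is None else W
    x = np.zeros(dim)
    for _ in range(iters):
        # Stage 1: direction from the stochastic fixed-point residual.
        idx = rng.choice(n, size=batch, replace=False)
        g = stochastic_grad(x, A, b, idx)
        direction = -W @ residual(x, g, step, mu)
        z = x + beta * direction
        # Stage 2: extra stochastic proximal gradient step from z.
        idx = rng.choice(n, size=batch, replace=False)
        gz = stochastic_grad(z, A, b, idx)
        x = soft_threshold(z - step * gz, step * mu)
    return x
```

Calling extra_step_sketch(A, b) with a data matrix A of shape (n, dim) and labels b in {-1, +1} returns an approximate stationary point. Passing a different W is where a genuine quasi-Newton or coordinate-type approximation would plug in. The step-size bounds required for the convergence guarantee mentioned in the abstract are not reproduced here.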
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Advanced Optimization Algorithms Research
Methods: Logistic Regression
