Faster Stochastic Quasi-Newton Methods
Qingsong Zhang, Feihu Huang, Cheng Deng, and Heng Huang

TL;DR
This paper introduces SpiderSQN, a faster stochastic quasi-Newton method that achieves optimal complexity bounds for nonconvex optimization and outperforms existing methods in experiments.
Contribution
The paper proposes SpiderSQN, a novel stochastic quasi-Newton algorithm with optimal complexity and enhanced practical performance through momentum schemes.
Findings
Achieves the best known SFO complexity of O(n + n^{1/2} ε^{-2}) in the finite-sum setting.
Matches the best known SFO complexity of O(ε^{-3}) in the online setting.
Outperforms state-of-the-art methods in benchmark experiments.
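For readers unfamiliar with the two settings, the bounds above can be read against the following formulation. This summary does not state the paper's exact problem definition, so the block below is an assumption: it uses the standard formulation from the nonconvex variance-reduction literature, with hypothetical symbols f_i, F, and ξ.

```latex
\begin{align*}
  % Finite-sum setting: n component functions, each possibly nonconvex
  \min_{x \in \mathbb{R}^d} f(x) &= \frac{1}{n} \sum_{i=1}^{n} f_i(x) \\
  % Online setting: expectation over a random sample \xi
  \min_{x \in \mathbb{R}^d} f(x) &= \mathbb{E}_{\xi}\bigl[ F(x; \xi) \bigr] \\
  % Assumed target: an \varepsilon-stationary point
  \mathbb{E}\, \lVert \nabla f(x) \rVert &\le \varepsilon
\end{align*}
```

Under this reading, SFO complexity counts the stochastic gradient evaluations (of some f_i or F(·; ξ)) needed to reach such a point.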
Abstract
Stochastic optimization methods have become a popular class of optimization tools in machine learning. In particular, stochastic gradient descent (SGD) is widely used for machine learning problems such as training neural networks because of its low per-iteration computational complexity. Newton and quasi-Newton methods, which leverage second-order information, can achieve better solutions than first-order methods. Stochastic quasi-Newton (SQN) methods have therefore been developed to reach better solutions more efficiently than stochastic first-order methods by using approximate second-order information. However, existing SQN methods still do not reach the best known stochastic first-order oracle (SFO) complexity. To fill this gap, we propose a novel, faster stochastic quasi-Newton method (SpiderSQN) based on the variance-reduction technique of SPIDER. We prove that…
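The abstract's core idea, a SPIDER-style variance-reduced gradient estimate combined with approximate second-order scaling, can be illustrated with a small sketch. Everything here is an assumption rather than the paper's algorithm: the toy least-squares problem, the function names, the step size, and the fixed diagonal scaling `h_diag`, which only stands in for a quasi-Newton inverse-Hessian approximation. The paper's actual update rule is not given in this summary.

```python
import math
import random

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def grad_i(x, a, b):
    # Gradient of the component f_i(x) = 0.5 * (a.x - b)^2.
    r = dot(a, x) - b
    return [r * aj for aj in a]

def full_grad(x, A, B):
    # Gradient of f(x) = (1/n) * sum_i f_i(x); costs n oracle calls.
    n = len(A)
    g = [0.0] * len(x)
    for a, b in zip(A, B):
        for j, gij in enumerate(grad_i(x, a, b)):
            g[j] += gij / n
    return g

def grad_norm(x, A, B):
    return math.sqrt(sum(gj * gj for gj in full_grad(x, A, B)))

def make_problem(n=20, d=3, seed=1):
    # Hypothetical toy data: consistent linear system b_i = a_i . x_true.
    rng = random.Random(seed)
    x_true = [rng.gauss(0.0, 1.0) for _ in range(d)]
    A = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
    B = [dot(a, x_true) for a in A]
    return A, B

def diag_scaling(A):
    # Placeholder for second-order information: inverse of the diagonal
    # of the Hessian (1/n) * A^T A. NOT a quasi-Newton (BFGS-type) update.
    n, d = len(A), len(A[0])
    return [n / sum(a[j] ** 2 for a in A) for j in range(d)]

def spider_scaled_descent(A, B, x0, h_diag, eta=0.05, q=5, batch=5,
                          iters=400, seed=0):
    rng = random.Random(seed)
    n = len(A)
    x, x_prev, v = list(x0), list(x0), None
    for k in range(iters):
        if k % q == 0:
            # Periodic refresh with the full gradient.
            v = full_grad(x, A, B)
        else:
            # SPIDER recursion: v_k = v_{k-1} + mean over the mini-batch
            # of (grad_i(x_k) - grad_i(x_{k-1})).
            diff = [0.0] * len(x)
            for _ in range(batch):
                i = rng.randrange(n)
                g_new = grad_i(x, A[i], B[i])
                g_old = grad_i(x_prev, A[i], B[i])
                for j in range(len(x)):
                    diff[j] += (g_new[j] - g_old[j]) / batch
            v = [vj + dj for vj, dj in zip(v, diff)]
        x_prev = x
        # Scaled step: x_{k+1} = x_k - eta * H * v_k with diagonal H.
        x = [xj - eta * hj * vj for xj, hj, vj in zip(x, h_diag, v)]
    return x

if __name__ == "__main__":
    A, B = make_problem()
    x0 = [0.0] * len(A[0])
    x = spider_scaled_descent(A, B, x0, diag_scaling(A))
    print("initial gradient norm:", grad_norm(x0, A, B))
    print("final gradient norm:  ", grad_norm(x, A, B))
```

Only every `q`-th iteration pays for a full pass over the data; the iterations in between touch `batch` samples each, which is where the savings in stochastic first-order oracle calls come from.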
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Tensor decomposition and applications
