Second-Order Stochastic Optimization for Machine Learning in Linear Time

Naman Agarwal; Brian Bullins; Elad Hazan

arXiv:1602.03943 · stat.ML · December 1, 2017 · 42 cites


TL;DR

This paper introduces second-order stochastic optimization methods whose per-iteration cost matches that of gradient-based methods and which can be implemented in time linear in the sparsity of the input data, making them practical for large-scale machine learning tasks.

Contribution

The authors develop second-order stochastic algorithms whose per-iteration cost matches that of gradient-based methods and which, in certain settings, improve on the overall running time of popular first-order methods.

Findings

01 Improves on the overall running time of popular first-order methods in certain settings

02 Runs in time linear in the sparsity of the input data

03 Applies to large-scale machine learning problems

Abstract

First-order stochastic methods are the state of the art in large-scale machine learning optimization owing to their efficient per-iteration complexity. Second-order methods, while able to provide faster convergence, have been much less explored due to the high cost of computing the second-order information. In this paper we develop second-order stochastic methods for optimization problems in machine learning that match the per-iteration cost of gradient-based methods, and in certain settings improve upon the overall running time of popular first-order methods. Furthermore, our algorithm has the desirable property of being implementable in time linear in the sparsity of the input data.
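
The abstract does not spell out how a second-order step can cost about as much as a gradient step. One way this can work, sketched below under stated assumptions, is to estimate the Newton direction H^{-1} g with a truncated Neumann-series recursion that only ever touches one sample's Hessian-vector product at a time. The ridge-regression loss, the function names `sample_hvp` and `neumann_newton_direction`, and the parameters `lam` and `depth` are illustrative choices, not taken from this page, and the sketch should not be read as the authors' exact algorithm.

```python
import numpy as np


def sample_hvp(x_i, lam, v):
    # Hessian-vector product for one ridge-regression sample,
    # (x_i x_i^T + lam * I) v, computed in O(d) without forming the matrix.
    # For a sparse x_i the inner product only touches its nonzero entries.
    return x_i * (x_i @ v) + lam * v


def neumann_newton_direction(X, grad, lam, depth, rng):
    # Recursion u_j = grad + (I - H_i) u_{j-1}, drawing a fresh random
    # sample i at every step. In expectation this approaches H^{-1} grad
    # (H = average of the H_i) when every per-sample Hessian H_i has
    # eigenvalues in (0, 1]. That scaling is an assumption of this sketch.
    u = grad.copy()
    for _ in range(depth):
        i = rng.integers(X.shape[0])
        u = grad + u - sample_hvp(X[i], lam, u)
    return u


# Hypothetical usage: rows scaled to norm 0.5 and lam = 0.5, so each
# per-sample Hessian has eigenvalues 0.5 and 0.75.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
X /= 2.0 * np.linalg.norm(X, axis=1, keepdims=True)
grad = rng.normal(size=5)
estimate = neumann_newton_direction(X, grad, lam=0.5, depth=50, rng=rng)
```

Each recursion step costs one Hessian-vector product on a single sample, which for this loss is the same order of work as one stochastic gradient evaluation. A single run gives a noisy estimate; averaging several independent runs would reduce the variance.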


Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Advanced Bandit Algorithms Research