Sub-Sampled Newton Methods II: Local Convergence Rates
Farbod Roosta-Khorasani, Michael W. Mahoney

TL;DR
This paper analyzes sub-sampled Newton methods for large-scale optimization, providing local convergence rates and strategies for balancing computational efficiency with convergence speed using Hessian and gradient sub-sampling.
Contribution
It introduces new local convergence results for sub-sampled Newton methods, including Hessian regularization and gradient sub-sampling, with problem-independent rates.
Findings
Establishes locally Q-linear and Q-superlinear convergence rates for Hessian sub-sampling.
Shows R-linear convergence when the gradient is sub-sampled in addition to the Hessian, and R-superlinear convergence when the sample size is increased very aggressively across iterations.
Provides problem-independent convergence rates applicable to large-scale convex optimization.
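For readers unfamiliar with the Q/R terminology used in these findings, the standard textbook definitions are sketched below. The symbols (iterates x_k, limit point x^*, contraction factor \rho, bounding sequence r_k) are notation assumed here for illustration and are not taken from the paper.

```latex
% Q-linear: the error contracts by a fixed factor at every (late enough) step.
\|x_{k+1} - x^*\| \le \rho \, \|x_k - x^*\|, \qquad 0 < \rho < 1, \quad k \ge k_0 .

% Q-superlinear: the contraction factor itself tends to zero.
\lim_{k \to \infty} \frac{\|x_{k+1} - x^*\|}{\|x_k - x^*\|} = 0 .

% R-linear (resp. R-superlinear): the error is dominated by a scalar sequence
% r_k that converges to zero Q-linearly (resp. Q-superlinearly).
\|x_k - x^*\| \le r_k, \qquad r_k \to 0 .
```

An R-rate is the weaker guarantee: the error is bounded by a well-behaved sequence, but it need not decrease at every single iteration.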
Abstract
Many data-fitting applications require the solution of an optimization problem involving a sum of a large number of functions of a high-dimensional parameter. Here, we consider the problem of minimizing a sum of n functions over a convex constraint set X ⊆ R^p, where both n and p are large. In such problems, sub-sampling as a way to reduce n can offer a great amount of computational efficiency. Within the context of second-order methods, we first give quantitative local convergence results for variants of Newton's method where the Hessian is uniformly sub-sampled. Using random matrix concentration inequalities, one can sub-sample in a way that the curvature information is preserved. Using such a sub-sampling strategy, we establish locally Q-linear and Q-superlinear convergence rates. We also give additional convergence results for when the sub-sampled…
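To make the idea of uniform Hessian sub-sampling concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' algorithm: it uses an unconstrained problem (X = R^p), an L2-regularized logistic loss chosen only as an example objective, the full gradient, a fixed sample size, unit step length with no line search, and a starting point of zero. The names `logistic_grad`, `subsampled_newton`, `lam`, and `sample_size` are hypothetical.

```python
import numpy as np


def logistic_grad(A, b, x, lam):
    """Full gradient of F(x) = (1/n) sum_i log(1 + exp(-b_i a_i^T x)) + (lam/2)||x||^2."""
    n = A.shape[0]
    sig = 1.0 / (1.0 + np.exp(b * (A @ x)))  # sigma(-b_i a_i^T x)
    return -(A.T @ (b * sig)) / n + lam * x


def subsampled_newton(A, b, lam=0.1, sample_size=500, iters=30, seed=0):
    """Newton-type iteration with a uniformly sub-sampled Hessian (illustrative only)."""
    rng = np.random.default_rng(seed)
    n, p = A.shape
    x = np.zeros(p)  # assumed starting point; the paper's results are local
    for _ in range(iters):
        grad = logistic_grad(A, b, x, lam)
        # Draw a uniform sample of the n component functions, without replacement.
        idx = rng.choice(n, size=sample_size, replace=False)
        A_s = A[idx]
        sig = 1.0 / (1.0 + np.exp(b[idx] * (A_s @ x)))
        w = sig * (1.0 - sig)  # per-sample curvature weights (labels are +/-1)
        # Sub-sampled Hessian: average curvature over the sample, plus the regularizer.
        H = (A_s.T * w) @ A_s / sample_size + lam * np.eye(p)
        # Unit-step Newton update using the approximate Hessian.
        x = x - np.linalg.solve(H, grad)
    return x
```

Only the Hessian is approximated here, so each iteration still touches all n data points once for the gradient; the saving is in forming the p-by-p curvature matrix from `sample_size` rows instead of n.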
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Complexity and Algorithms in Graphs
