Newton Meets Marchenko-Pastur: Massively Parallel Second-Order Optimization with Hessian Sketching and Debiasing
Elad Romanov, Fangzhao Zhang, Mert Pilanci

TL;DR
This paper introduces a massively parallel Newton method for convex optimization in serverless environments, using adaptive Hessian sketching based on random matrix theory to efficiently approximate inverse Hessians with convergence guarantees.
Contribution
It proposes a novel adaptive Hessian sketching scheme for distributed Newton methods, leveraging the Marchenko-Pastur law for dimension selection and providing non-asymptotic guarantees.
Findings
Dimension-free guarantees for Gaussian sketching matrices.
Effective approximation of Newton steps with low-bias Hessian estimates.
Convergence guarantees for self-concordant objectives with noisy Hessians.
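To illustrate why bias matters when sketched Hessians are inverted, here is a minimal numerical sketch. It is not the paper's procedure: it uses the unregularized case, where the standard inverse-Wishart mean gives a closed form, E[(AᵀSᵀSA)⁻¹] = m/(m − d − 1) · (AᵀA)⁻¹ for an m×n Gaussian sketch S with i.i.d. N(0, 1/m) entries. The matrix `A`, the sizes, and the helper name `mean_sketched_inverse` are illustrative assumptions.

```python
import numpy as np

def mean_sketched_inverse(A, m, trials, seed=0):
    """Monte Carlo estimate of E[(A^T S^T S A)^{-1}] for Gaussian sketches S (m x n)."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    acc = np.zeros((d, d))
    for _ in range(trials):
        # Entries are N(0, 1/m), so E[S^T S] = I_n and the sketched Hessian is unbiased.
        S = rng.standard_normal((m, n)) / np.sqrt(m)
        SA = S @ A
        acc += np.linalg.inv(SA.T @ SA)
    return acc / trials

# Hypothetical toy sizes, chosen only for illustration.
rng = np.random.default_rng(42)
n, d, m = 50, 5, 20
A = rng.standard_normal((n, d))

H_inv = np.linalg.inv(A.T @ A)                 # exact inverse Hessian
avg_inv = mean_sketched_inverse(A, m, trials=4000)

inflation = np.trace(avg_inv) / np.trace(H_inv)  # empirical inflation factor
predicted = m / (m - d - 1)                      # inverse-Wishart prediction
print(f"empirical inflation: {inflation:.3f}, predicted: {predicted:.3f}")
```

Although the sketched Hessian itself is unbiased, its inverse is inflated on average, so averaging many workers' inverses does not converge to the exact inverse unless the estimate is corrected.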
Abstract
Motivated by recent advances in serverless cloud computing, in particular the "function as a service" (FaaS) model, we consider the problem of minimizing a convex function in a massively parallel fashion, where communication between workers is limited. Focusing on the case of a twice-differentiable objective subject to an L2 penalty, we propose a scheme where the central node (server) effectively runs a Newton method, offloading its high per-iteration cost -- stemming from the need to invert the Hessian -- to the workers. In our solution, workers produce independently coarse but low-bias estimates of the inverse Hessian, using an adaptive sketching scheme. The server then averages the descent directions produced by the workers, yielding a good approximation for the exact Newton step. The main component of our adaptive sketching scheme is a low-complexity procedure for selecting the…
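The worker/server split described in the abstract can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the Hessian is taken to have the form AᵀA + λI (as in L2-penalized least squares), each worker uses a Gaussian sketch, and the regularization `lam` is left unchanged in every worker, so the adaptive, debiasing part of the scheme is not reproduced here. All function names and problem sizes are hypothetical.

```python
import numpy as np

def sketched_newton_direction(A, grad, lam, m, rng):
    """One worker: sketch A, build a coarse Hessian estimate, and solve for a direction."""
    n, d = A.shape
    S = rng.standard_normal((m, n)) / np.sqrt(m)   # Gaussian sketch, E[S^T S] = I_n
    SA = S @ A                                     # m x d sketched data
    H_hat = SA.T @ SA + lam * np.eye(d)            # coarse estimate of A^T A + lam * I
    return np.linalg.solve(H_hat, grad)

def averaged_direction(A, grad, lam, m, num_workers, seed=0):
    """Server: average the directions returned by independent workers."""
    rng = np.random.default_rng(seed)
    dirs = [sketched_newton_direction(A, grad, lam, m, rng) for _ in range(num_workers)]
    return np.mean(dirs, axis=0)

def relative_error(approx, exact):
    return np.linalg.norm(approx - exact) / np.linalg.norm(exact)

# Hypothetical toy problem.
rng = np.random.default_rng(1)
n, d, lam, m = 2000, 20, 1.0, 100
A = rng.standard_normal((n, d)) / np.sqrt(n)
grad = rng.standard_normal(d)

exact = np.linalg.solve(A.T @ A + lam * np.eye(d), grad)   # exact Newton direction
single = averaged_direction(A, grad, lam, m, num_workers=1)
many = averaged_direction(A, grad, lam, m, num_workers=200)

print("relative error, 1 worker:   ", relative_error(single, exact))
print("relative error, 200 workers:", relative_error(many, exact))
```

Averaging reduces the variance of the direction, but with `lam` held fixed a residual bias remains no matter how many workers are used; reducing that bias is the role of the low-bias estimates the abstract refers to.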
Taxonomy
Topics: Advanced Optimization Algorithms Research · Neural Networks and Reservoir Computing
Methods: Network On Network
