Unifying Optimization and Dynamics to Parallelize Sequential Computation: A Guide to Parallel Newton Methods for Breaking Sequential Bottlenecks
Xavier Gonzalez

TL;DR
This thesis introduces scalable, stable parallel Newton methods for evaluating dynamical systems. The methods parallelize otherwise sequential computations in machine learning across the sequence length and come with theoretical guarantees based on dynamical stability analysis.
Contribution
It develops novel quasi-Newton and trust-region parallel Newton methods, unifies fixed-point methods under this framework, and characterizes when parallelization accelerates the evaluation of dynamical systems.
Findings
Parallel Newton methods achieve linear convergence under certain conditions.
Stability and efficiency are improved with trust-region approaches.
The sign of the Largest Lyapunov Exponent predicts parallelization success.
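As a rough illustration of the quantity named in the last finding, the sketch below estimates the largest Lyapunov exponent of a scalar map as the long-run average of log|f'(x_t)| along a trajectory. The two example maps, the starting points, and the function name `estimate_lle` are illustrative assumptions, not taken from the thesis. A negative estimate indicates contracting dynamics and a positive one indicates chaotic dynamics; how the sign relates to parallelization is the thesis's result and is not derived here.

```python
import math

def estimate_lle(step, step_prime, x0, n_steps=5000, burn_in=100):
    """Estimate the largest Lyapunov exponent of a scalar map x -> step(x).

    For a one-dimensional map this is the trajectory average of
    log|step_prime(x_t)|. All names here are hypothetical.
    """
    x = x0
    for _ in range(burn_in):          # discard the initial transient
        x = step(x)
    total = 0.0
    for _ in range(n_steps):
        # Guard against log(0) if the derivative vanishes at some point.
        total += math.log(max(abs(step_prime(x)), 1e-300))
        x = step(x)
    return total / n_steps

# Assumed example 1: a contracting map, |f'(x)| <= 0.5 everywhere.
contracting = estimate_lle(
    step=lambda x: 0.5 * math.tanh(x) + 0.3,
    step_prime=lambda x: 0.5 * (1.0 - math.tanh(x) ** 2),
    x0=0.1,
)

# Assumed example 2: the logistic map with r = 4, a standard chaotic map.
chaotic = estimate_lle(
    step=lambda x: 4.0 * x * (1.0 - x),
    step_prime=lambda x: 4.0 - 8.0 * x,
    x0=0.2,
)

print("contracting map LLE estimate:", contracting)
print("chaotic map LLE estimate:", chaotic)
```

For higher-dimensional systems the same idea is usually applied to products of Jacobians rather than scalar derivatives; the scalar case is used here only to keep the sketch short.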
Abstract
Massively parallel hardware (GPUs) and long sequence data have made parallel algorithms essential for machine learning at scale. Yet dynamical systems, like recurrent neural networks and Markov chain Monte Carlo, were thought to suffer from sequential bottlenecks. Recent work showed that dynamical systems can in fact be parallelized across the sequence length by reframing their evaluation as a system of nonlinear equations, which can be solved with Newton's method using a parallel associative scan. However, these parallel Newton methods struggled with limitations, primarily inefficiency, instability, and lack of convergence guarantees. This thesis addresses these limitations with methodological and theoretical contributions, drawing particularly from optimization. Methodologically, we develop scalable and stable parallel Newton methods, based on quasi-Newton and trust-region approaches.…
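The reframing described in the abstract can be sketched for a scalar recurrence. Treat all T states as unknowns of the system x_t = f(x_{t-1}, u_t). Each Newton step linearizes f around the current guess, which yields a linear recurrence x_t = a_t x_{t-1} + b_t, and composition of affine maps is associative, so that recurrence can be solved with a scan. The code below is a minimal pure-Python sketch under assumed choices: the tanh recurrence, the weight `w`, and all function names are hypothetical, and the scan is simulated round by round rather than run on parallel hardware. It is not the thesis's implementation and does not include the quasi-Newton or trust-region variants.

```python
import math

def f(x, u, w=0.9):
    """Hypothetical scalar recurrence: x_t = tanh(w * x_{t-1} + u_t)."""
    return math.tanh(w * x + u)

def df(x, u, w=0.9):
    """Derivative of f with respect to x."""
    return w * (1.0 - math.tanh(w * x + u) ** 2)

def combine(first, second):
    """Compose affine maps x -> a*x + b, applying `first` then `second`."""
    a1, b1 = first
    a2, b2 = second
    return (a2 * a1, a2 * b1 + b2)

def inclusive_scan(elems):
    """Hillis-Steele scan: about log2(T) rounds, each round's combines are independent."""
    out = list(elems)
    step = 1
    while step < len(out):
        out = [out[i] if i < step else combine(out[i - step], out[i])
               for i in range(len(out))]
        step *= 2
    return out

def sequential(x0, us):
    """Reference: ordinary step-by-step evaluation."""
    xs, x = [], x0
    for u in us:
        x = f(x, u)
        xs.append(x)
    return xs

def parallel_newton(x0, us, max_iters=100, tol=1e-12):
    """Solve x_t = f(x_{t-1}, u_t) for all t at once with Newton iterations."""
    xs = [0.0] * len(us)                      # initial guess for x_1..x_T
    for it in range(1, max_iters + 1):
        prev = [x0] + xs[:-1]                 # current guess for x_0..x_{T-1}
        a = [df(p, u) for p, u in zip(prev, us)]
        b = [f(p, u) - ai * p for p, u, ai in zip(prev, us, a)]
        prefix = inclusive_scan(list(zip(a, b)))
        new_xs = [A * x0 + B for A, B in prefix]   # apply composed maps to x_0
        err = max(abs(n - o) for n, o in zip(new_xs, xs))
        xs = new_xs
        if err < tol:
            return xs, it
    return xs, max_iters

us = [math.sin(0.5 * t) for t in range(16)]   # assumed input sequence
ref = sequential(0.1, us)
xs, iters = parallel_newton(0.1, us)
print("Newton iterations:", iters)
print("max abs difference:", max(abs(p - r) for p, r in zip(xs, ref)))
```

On an accelerator, the hand-written `inclusive_scan` would typically be replaced by a library primitive such as `jax.lax.associative_scan` with the same `combine` operator, and for vector-valued states `a_t` becomes a Jacobian matrix.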
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Markov Chains and Monte Carlo Methods · Advanced Optimization Algorithms Research
