An Additively Preconditioned Trust Region Strategy for Machine Learning
Samuel Cruz Alegría, Bindi Çapriqi, Shega Likaj, Ken Trotti, Rolf Krause

TL;DR
This paper introduces a novel additively preconditioned trust region method for large-scale nonconvex optimization in machine learning, leveraging parallel Schwarz corrections to accelerate convergence and reduce hyperparameter tuning.
Contribution
It proposes a new nonlinearly preconditioned trust region algorithm combining additive Schwarz methods with classical trust-region strategies for improved optimization in deep learning.
Findings
Accelerates convergence in large-scale nonconvex problems.
Reduces need for hyperparameter tuning.
Enables parallel local sub-problem solving.
Abstract
Modern machine learning, especially the training of deep neural networks, depends on solving large-scale, highly nonconvex optimization problems, whose objective functions exhibit a rough landscape. Motivated by the success of parallel preconditioners in the context of Krylov methods for large-scale linear systems, we introduce a novel nonlinearly preconditioned Trust-Region method that makes use of an additive Schwarz correction at each minimization step, thereby accelerating convergence. More precisely, we propose a variant of the Additively Preconditioned Trust-Region Strategy (APTS), which combines a right-preconditioned additive Schwarz framework with a classical Trust-Region algorithm. By decomposing the parameter space into sub-domains, APTS solves local non-linear sub-problems in parallel and assembles their corrections additively. The resulting method not only shows fast…
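Illustrative sketch
The abstract's core loop (split the parameters into sub-domains, solve local sub-problems independently, add the corrections, then accept or reject the step with a trust-region test) can be sketched roughly as follows. This is not the authors' APTS implementation: the toy chained Rosenbrock objective, the gradient-descent local solver, the first-order model used for the predicted reduction, the fallback to a steepest-descent step, and all constants (`lr`, `eta`, the radius update factors) are assumptions made only for illustration. All function names are hypothetical.

```python
import numpy as np


def f(x):
    # Toy nonconvex objective (chained Rosenbrock); an assumption, not from the paper.
    return float(np.sum(100.0 * (x[1:] - x[:-1] ** 2) ** 2 + (1.0 - x[:-1]) ** 2))


def grad(x):
    # Analytic gradient of the toy objective above.
    g = np.zeros_like(x)
    r = x[1:] - x[:-1] ** 2
    g[:-1] += -400.0 * x[:-1] * r - 2.0 * (1.0 - x[:-1])
    g[1:] += 200.0 * r
    return g


def local_correction(x, idx, radius, inner_steps=5, lr=1e-3):
    # Approximately minimize f over the coordinates in idx, all others frozen.
    # The local iterate is kept inside the trust region of the given radius.
    y = x.copy()
    for _ in range(inner_steps):
        y[idx] -= lr * grad(y)[idx]
        s = y - x
        norm = np.linalg.norm(s)
        if norm > radius:
            y = x + s * (radius / norm)
    return y - x


def apts_sketch(x0, n_subdomains=2, radius=1.0, max_iter=200, eta=1e-4):
    x = np.asarray(x0, dtype=float).copy()
    # Non-overlapping decomposition of the parameter indices into sub-domains.
    subdomains = np.array_split(np.arange(x.size), n_subdomains)
    history = [f(x)]
    for _ in range(max_iter):
        g = grad(x)
        gnorm = np.linalg.norm(g)
        if gnorm < 1e-8:
            break
        # Local solves are independent, so they could run in parallel.
        # Their corrections are assembled additively.
        s = sum(local_correction(x, idx, radius) for idx in subdomains)
        norm = np.linalg.norm(s)
        if norm > radius:
            s = s * (radius / norm)
        # Predicted reduction from a first-order model (an assumption).
        pred = -float(g @ s)
        if pred <= 0.0:
            # Fall back to a steepest-descent step on the trust-region boundary.
            s = -(radius / gnorm) * g
            pred = radius * gnorm
        rho = (f(x) - f(x + s)) / pred
        if rho > eta:
            x = x + s  # accept the step
            if rho > 0.75:
                radius = min(2.0 * radius, 10.0)
        else:
            radius *= 0.5  # reject the step and shrink the trust region
        history.append(f(x))
    return x, history


x_final, history = apts_sketch(np.array([-1.2, 1.0, -1.2, 1.0]))
print(history[0], history[-1])
```

Because a step is accepted only when the actual reduction is positive, the recorded objective values never increase in this sketch. The radius is adjusted automatically from the ratio `rho`, which is the sense in which trust-region methods reduce reliance on a hand-tuned learning rate.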
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Matrix Theory and Algorithms · Sparse and Compressive Sensing Techniques
