Practical Quasi-Newton Methods for Training Deep Neural Networks
Donald Goldfarb, Yi Ren, Achraf Bahamou

TL;DR
This paper introduces practical stochastic quasi-Newton methods, specifically Kronecker-factored block-diagonal BFGS and L-BFGS, for efficient training of deep neural networks by approximating the Hessian with structured, layer-wise Kronecker products.
Contribution
The paper proposes Kronecker-factored block-diagonal quasi-Newton methods (BFGS and L-BFGS variants) with a damping strategy, which match or outperform existing methods such as KFAC and first-order approaches in training deep neural networks.
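To illustrate what a damping strategy does in a BFGS update, here is a minimal sketch of the classical Powell damping applied to a direct Hessian approximation B. This is the textbook scheme and is an assumption for illustration only; it is not necessarily the damping used in the paper. The function name damped_bfgs_update and the threshold mu = 0.2 are illustrative choices.

```python
import numpy as np

def damped_bfgs_update(B, s, y, mu=0.2):
    """One BFGS update of a Hessian approximation B with Powell damping.

    s is the step, y the change in gradient. When the curvature s^T y is
    too small (or negative), y is blended with B s so that the update
    keeps B positive definite. Illustrative sketch, not the paper's method.
    """
    Bs = B @ s
    sBs = s @ Bs
    sy = s @ y
    if sy < mu * sBs:
        # Choose theta so that s^T y_bar equals mu * s^T B s exactly.
        theta = (1.0 - mu) * sBs / (sBs - sy)
    else:
        theta = 1.0  # curvature is adequate, use y unchanged
    y_bar = theta * y + (1.0 - theta) * Bs
    # Standard BFGS formula with y replaced by the damped y_bar.
    return B - np.outer(Bs, Bs) / sBs + np.outer(y_bar, y_bar) / (s @ y_bar)

# A pair with negative curvature (s^T y < 0), which would break plain BFGS.
B = np.eye(3)
s = np.array([1.0, 0.0, 0.0])
y = np.array([-0.5, 0.2, 0.0])

B_new = damped_bfgs_update(B, s, y)
min_eig = np.linalg.eigvalsh(B_new).min()
```

With the negative-curvature pair above, the undamped update would divide by a negative s^T y and lose positive definiteness; the damped version keeps every eigenvalue of B_new positive.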
Findings
Outperformed or matched KFAC and first-order methods on multiple datasets
Efficiently approximated the Hessian for large-scale neural networks
Demonstrated effectiveness on multi-layer autoencoder models
Abstract
We consider the development of practical stochastic quasi-Newton, and in particular Kronecker-factored block-diagonal BFGS and L-BFGS methods, for training deep neural networks (DNNs). In DNN training, the number of variables and components of the gradient n is often of the order of tens of millions and the Hessian has n^2 elements. Consequently, computing and storing a full n × n BFGS approximation or storing a modest number of (step, change in gradient) vector pairs for use in an L-BFGS implementation is out of the question. In our proposed methods, we approximate the Hessian by a block-diagonal matrix and use the structure of the gradient and Hessian to further approximate these blocks, each of which corresponds to a layer, as the Kronecker product of two much smaller matrices. This is analogous to the approach in KFAC, which computes a Kronecker-factored block-diagonal…
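The practical benefit of a Kronecker-factored block can be sketched as follows. If one layer's curvature block is approximated as A ⊗ G, with A of size d_in × d_in and G of size d_out × d_out, then a linear system with that block can be solved using only the two small factors, via the identity (A ⊗ G) vec(X) = vec(G X A^T) for column-major vec. The names A, G, V, kron_solve and the layer sizes below are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def kron_solve(A, G, V):
    """Return X such that (A kron G) vec(X) = vec(V), never forming A kron G.

    Uses (A kron G) vec(X) = vec(G X A^T) with column-major vec, so
    X = G^{-1} V A^{-T}. Cost is two small solves instead of one
    (d_in * d_out)-dimensional solve.
    """
    Y = np.linalg.solve(G, V)         # Y = G^{-1} V
    return np.linalg.solve(A, Y.T).T  # X = Y A^{-T}

def random_spd(n, rng):
    """Random symmetric positive definite matrix (illustrative helper)."""
    M = rng.standard_normal((n, n))
    return M @ M.T + n * np.eye(n)

rng = np.random.default_rng(0)
d_in, d_out = 5, 3                       # hypothetical layer sizes
A = random_spd(d_in, rng)                # small factor, d_in x d_in
G = random_spd(d_out, rng)               # small factor, d_out x d_out
V = rng.standard_normal((d_out, d_in))   # stands in for a layer gradient

X = kron_solve(A, G, V)

# Check against the dense system, feasible only because the sizes are tiny.
dense = np.linalg.solve(np.kron(A, G), V.flatten(order="F"))
max_err = np.max(np.abs(X.flatten(order="F") - dense))
```

For a layer with thousands of inputs and outputs, the dense matrix A ⊗ G would have on the order of (d_in · d_out)^2 entries, while the factored solve only ever touches matrices of size d_in × d_in and d_out × d_out.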
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Matrix Theory and Algorithms
