Exact Stochastic Second Order Deep Learning
Fares B. Mehouachi, Chaouki Kasmi

TL;DR
This paper introduces an exact stochastic second-order optimization method for deep learning that overcomes traditional computational challenges, enabling more efficient training by leveraging regularization and spectral adjustments.
Contribution
It provides a closed-form formula for the exact stochastic Hessian and Newton direction, addressing non-convexity and promoting flat minima in deep learning optimization.
Findings
The method accurately computes the stochastic Hessian eigenvalues.
It effectively finds the Newton direction in non-convex settings.
Experimental results show improved optimization on popular datasets.
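For readers unfamiliar with why non-convexity complicates Newton's method, the following is a minimal sketch of one generic spectral adjustment: replacing each Hessian eigenvalue by its absolute value before inverting. It is not the paper's closed-form formula, which this summary does not reproduce. The function names, the `damping` parameter, and the toy 2D saddle are assumptions chosen purely for illustration; in a stochastic setting the gradient and Hessian would come from a mini-batch loss rather than a hand-written function.

```python
import math

def eig_sym_2x2(a, b, d):
    """Eigenpairs of the symmetric matrix [[a, b], [b, d]] in closed form."""
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    theta = 0.5 * math.atan2(2.0 * b, a - d)  # angle of the top eigenvector
    v_top = (math.cos(theta), math.sin(theta))
    v_bot = (-math.sin(theta), math.cos(theta))
    return [(mean + radius, v_top), (mean - radius, v_bot)]

def adjusted_newton_direction(grad, hess, damping=1e-8):
    """Return -sum_i (v_i . g) / (|lambda_i| + damping) * v_i.

    `hess` is the triple (a, b, d) describing [[a, b], [b, d]].
    Taking |lambda_i| is an illustrative assumption, not the paper's method.
    """
    gx, gy = grad
    dx = dy = 0.0
    for lam, (vx, vy) in eig_sym_2x2(*hess):
        coeff = (vx * gx + vy * gy) / (abs(lam) + damping)
        dx -= coeff * vx
        dy -= coeff * vy
    return dx, dy

# Toy saddle f(x, y) = x**2 - y**2 evaluated at the point (1, 1).
grad = (2.0, -2.0)        # (df/dx, df/dy)
hess = (2.0, 0.0, -2.0)   # [[2, 0], [0, -2]]
direction = adjusted_newton_direction(grad, hess)
```

On this toy saddle the unmodified Newton step from (1, 1) is (-1, -1), which lands exactly on the saddle point at the origin. The adjusted direction keeps the step along x but flips it along y, the negative-curvature axis, so it moves away from the saddle and remains a descent direction.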
Abstract
Optimization in Deep Learning is dominated by first-order methods, which are built around the central concept of backpropagation. Second-order optimization methods, which take into account second-order derivatives, are far less used despite superior theoretical properties. This inadequacy of second-order methods stems from their exorbitant computational cost, poor performance, and the ineluctable non-convex nature of Deep Learning. Several attempts were made to resolve the inadequacy of second-order optimization without reaching a cost-effective solution, much less an exact solution. In this work, we show that this long-standing problem in Deep Learning could be solved in the stochastic case, given a suitable regularization of the neural network. Interestingly, we provide an expression of the stochastic Hessian and its exact eigenvalues. We provide a closed-form formula for the…
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Machine Learning and ELM
