A survey of deep learning optimizers -- first and second order methods
Rohan Kashyap

TL;DR
This paper provides a comprehensive review of 14 optimization methods used in deep learning, analyzing their effectiveness and challenges in high-dimensional, complex loss landscapes.
Contribution
It offers a detailed survey of optimization algorithms and assesses their theoretical and practical challenges in deep learning.
Findings
Reviewed 14 optimization methods for deep learning
Analyzed difficulties like saddle points and ill-conditioning
Provided theoretical insights into optimization challenges
Abstract
Deep learning optimization involves minimizing a high-dimensional loss function over the weight space, a task often perceived as difficult because of saddle points, local minima, ill-conditioning of the Hessian, and limited compute resources. In this paper, we provide a comprehensive review of standard optimization methods successfully used in deep learning research, together with a theoretical assessment of the difficulties in numerical optimization drawn from the optimization literature.
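To make two of the difficulties named in the abstract concrete, the following is a minimal sketch and is not taken from the paper. It assumes a toy quadratic loss f(x) = 0.5 xᵀHx with a hand-picked diagonal Hessian H. The function name `gradient_descent_steps`, the matrices, and the step sizes are all illustrative choices. The sketch counts how many plain gradient-descent steps are needed on a well-conditioned and an ill-conditioned Hessian, and checks the eigenvalue signs that characterize a saddle point.

```python
import numpy as np


def gradient_descent_steps(hessian, x0, lr, tol=1e-6, max_steps=100_000):
    """Count gradient-descent steps until the gradient norm falls below tol.

    The loss is the quadratic f(x) = 0.5 * x^T H x, so the gradient is H x.
    (Hypothetical helper, used only for this illustration.)
    """
    x = np.asarray(x0, dtype=float)
    for step in range(max_steps):
        grad = hessian @ x
        if np.linalg.norm(grad) < tol:
            return step
        x = x - lr * grad
    return max_steps


x0 = np.array([1.0, 1.0])

# Condition number = largest eigenvalue / smallest eigenvalue of the Hessian.
well_conditioned = np.diag([1.0, 1.0])    # condition number 1
ill_conditioned = np.diag([1.0, 100.0])   # condition number 100

# Use step size 1 / (largest eigenvalue), a standard safe choice for quadratics.
steps_well = gradient_descent_steps(well_conditioned, x0, lr=1.0 / 1.0)
steps_ill = gradient_descent_steps(ill_conditioned, x0, lr=1.0 / 100.0)
print("steps (well-conditioned):", steps_well)
print("steps (ill-conditioned): ", steps_ill)

# A saddle point has a Hessian with both positive and negative eigenvalues.
saddle_hessian = np.diag([1.0, -1.0])
saddle_eigs = np.linalg.eigvalsh(saddle_hessian)
is_saddle = bool(saddle_eigs.min() < 0 < saddle_eigs.max())
print("saddle point:", is_saddle)
```

With the step size limited by the largest curvature, progress along the low-curvature direction is slow, which is why the ill-conditioned case needs far more steps. Real deep learning losses are not quadratic, so this only illustrates the local behavior near a critical point.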
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Neural Networks and Applications · Metaheuristic Optimization Algorithms Research
