A survey of deep learning optimizers -- first and second order methods

Rohan Kashyap

arXiv:2211.15596·cs.LG·September 28, 2023·6 cites

Open Access

TL;DR

This paper provides a comprehensive review of 14 optimization methods used in deep learning, analyzing their effectiveness and challenges in high-dimensional, complex loss landscapes.

Contribution

It offers a detailed survey of optimization algorithms and assesses their theoretical and practical challenges in deep learning.

Findings

01. Reviewed 14 optimization methods for deep learning

02. Analyzed difficulties like saddle points and ill-conditioning

03. Provided theoretical insights into optimization challenges

Abstract

Deep learning optimization involves minimizing a high-dimensional loss function over the weight space. The problem is often perceived as difficult because of saddle points, local minima, ill-conditioning of the Hessian, and limited compute resources. In this paper, we provide a comprehensive review of 14 standard optimization methods successfully used in deep learning research, together with a theoretical assessment of the difficulties in numerical optimization drawn from the optimization literature.
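
To make the ill-conditioning difficulty named in the abstract concrete, here is a minimal sketch in plain Python. It is not taken from the paper: the diagonal quadratic loss, the curvature values (1, 100), the step-size rule, and the helper names `gradient_descent` and `loss` are all illustrative assumptions. It runs plain gradient descent on two toy losses that differ only in the condition number of their Hessian.

```python
def gradient_descent(curvatures, start, lr, steps):
    """Plain gradient descent on the toy loss f(w) = 0.5 * sum(c_i * w_i**2).

    `curvatures` are the Hessian eigenvalues of this (assumed) diagonal quadratic.
    """
    w = list(start)
    for _ in range(steps):
        # The gradient of the diagonal quadratic is c_i * w_i in each coordinate.
        w = [wi - lr * ci * wi for wi, ci in zip(w, curvatures)]
    return w


def loss(curvatures, w):
    """Evaluate the toy quadratic loss at w."""
    return 0.5 * sum(ci * wi ** 2 for ci, wi in zip(curvatures, w))


# Hypothetical example values: Hessian eigenvalues (1, 100), condition number 100.
ill = (1.0, 100.0)
# Hessian eigenvalues (1, 1), condition number 1.
well = (1.0, 1.0)

start = (1.0, 1.0)
steps = 100

# Assumed step-size rule: 1 / (largest curvature), which is stable for both problems.
w_ill = gradient_descent(ill, start, lr=1.0 / max(ill), steps=steps)
w_well = gradient_descent(well, start, lr=1.0 / max(well), steps=steps)

print("ill-conditioned loss after 100 steps: ", loss(ill, w_ill))
print("well-conditioned loss after 100 steps:", loss(well, w_well))
```

In this sketch the steep direction forces a small step size, so the flat direction shrinks by only a factor of 0.99 per step and the ill-conditioned problem is still far from its minimum after 100 steps, while the well-conditioned one reaches it immediately. Adaptive and second-order methods of the kind the survey covers are commonly motivated by exactly this mismatch between curvature directions.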

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Neural Networks and Applications · Metaheuristic Optimization Algorithms Research