Gradient Descent based Optimization Algorithms for Deep Learning Models Training

Jiawei Zhang

arXiv:1903.03614 · cs.LG · March 12, 2019 · 45 cites

TL;DR

This paper introduces the gradient descent based optimization algorithms used to train deep neural networks, from vanilla gradient descent to the more recent variants proposed to improve learning performance.

Contribution

It provides an overview of recent gradient descent variants like Momentum, Adagrad, Adam, and Gadam, explaining their differences and applications in deep learning.

Findings

01 Gradient descent variants enhance training efficiency.

02 Different algorithms suit different deep learning scenarios.

03 Most deep learning models still rely on back propagation with these optimizers.

Abstract

In this paper, we aim to provide an introduction to the gradient descent based optimization algorithms for learning deep neural network models. Deep learning models involving multiple nonlinear projection layers are very challenging to train. Nowadays, most deep learning model training still relies on the back propagation algorithm. In back propagation, the model variables are updated iteratively until convergence with gradient descent based optimization algorithms. Besides the conventional vanilla gradient descent algorithm, many gradient descent variants have been proposed in recent years to improve the learning performance, including Momentum, Adagrad, Adam, and Gadam, each of which is introduced in this paper.
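
To make the iterative updates mentioned in the abstract concrete, the following is a minimal sketch of four of the named optimizers applied to a single scalar variable. It is not taken from the paper: the toy loss f(w) = (w - 3)^2, the function names, and every hyperparameter value (learning rates, decay rates, step counts) are assumptions chosen for illustration, and the update rules follow the commonly used textbook forms rather than the paper's notation. Gadam is omitted because its details are not given in this abstract.

```python
import math

def grad(w):
    # Gradient of the assumed toy loss f(w) = (w - 3)^2; the minimum is at w = 3.
    return 2.0 * (w - 3.0)

def vanilla_gd(w, lr=0.1, steps=500):
    # Vanilla gradient descent: step against the gradient with a fixed learning rate.
    for _ in range(steps):
        w -= lr * grad(w)
    return w

def momentum_gd(w, lr=0.1, beta=0.9, steps=500):
    # Momentum: keep a decaying running sum of past update steps.
    v = 0.0
    for _ in range(steps):
        v = beta * v + lr * grad(w)
        w -= v
    return w

def adagrad(w, lr=0.5, eps=1e-8, steps=500):
    # Adagrad: scale each step by the root of the accumulated squared gradients.
    g2_sum = 0.0
    for _ in range(steps):
        g = grad(w)
        g2_sum += g * g
        w -= lr * g / (math.sqrt(g2_sum) + eps)
    return w

def adam(w, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    # Adam: bias-corrected moving averages of the gradient (m) and its square (v).
    m = 0.0
    v = 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g * g
        m_hat = m / (1.0 - beta1 ** t)
        v_hat = v / (1.0 - beta2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w

if __name__ == "__main__":
    # Run every optimizer from the same starting point and print where it ends up.
    for optimizer in (vanilla_gd, momentum_gd, adagrad, adam):
        print(optimizer.__name__, optimizer(0.0))
```

In a real deep learning model, w would be a vector of model variables and grad would be computed by back propagation; the scalar case is used here only to keep each update rule readable.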

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Advanced Neural Network Applications

Methods: Adam