A Comprehensive Study on Optimization Strategies for Gradient Descent In Deep Learning

Kaustubh Yadav

arXiv:2101.02397·cs.LG·January 8, 2021

Open Access

TL;DR

This paper reviews various optimization strategies for gradient descent in deep learning, addressing issues of speed and accuracy, and discusses algorithm architectures and neural network optimization techniques.

Contribution

It provides a comprehensive overview of optimization methods for gradient descent and explores their architectures and enhancements for neural network training.

Findings

1. Different optimization algorithms improve convergence speed.
2. Optimization strategies enhance neural network performance.
3. Discussion of algorithm architectures aids in understanding improvements.

Abstract

One of the most important parts of training artificial neural networks is minimizing the loss function, which tells us how good or bad our model is. To minimize the loss we need to tune the weights and biases. To find the minimum of a function we need its gradient, and to update the weights we use gradient descent. However, regular gradient descent has some problems: it is quite slow and not that accurate. This article aims to give an introduction to optimization strategies for gradient descent. In addition, we also discuss the architecture of these algorithms and the further optimization of neural networks in general.
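
The plain gradient descent update that the abstract takes as its starting point can be sketched as follows. This is a minimal illustration, not the paper's code: the function name `gradient_descent`, the toy quadratic loss, the `target` vector, and the learning rate are all assumptions chosen for the example.

```python
import numpy as np

def gradient_descent(grad_fn, w0, learning_rate=0.1, steps=100):
    """Plain gradient descent: repeatedly apply w <- w - learning_rate * grad(w)."""
    w = np.asarray(w0, dtype=float)
    for _ in range(steps):
        # Step against the gradient, scaled by the learning rate.
        w = w - learning_rate * grad_fn(w)
    return w

# Hypothetical toy loss L(w) = ||w - target||^2, whose gradient is 2 * (w - target).
target = np.array([3.0, -1.0])

def grad(w):
    return 2.0 * (w - target)

w_final = gradient_descent(grad, w0=[0.0, 0.0])
print(w_final)  # approaches `target` as the number of steps grows
```

On this toy loss each step shrinks the distance to the minimum by a constant factor of `1 - 2 * learning_rate`, so a small learning rate converges slowly. That slowness is the kind of problem the optimization strategies surveyed in the paper aim to address.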

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Advanced Neural Network Applications · Machine Learning and Algorithms