Gradient Descent Algorithm Survey

Deng Fucheng; Wang Wanjie; Gong Ao; Wang Xiaoqi; Wang Fan

arXiv:2511.20725·cs.LG·November 27, 2025

Open Access

TL;DR

This survey reviews five major gradient descent algorithms used in deep learning, analyzing their advantages, limitations, and practical recommendations to guide effective selection and tuning in various training scenarios.

Contribution

It provides a systematic comparison and practical guidance on five key optimization algorithms, aiding researchers and practitioners in their application and tuning.

Findings

1. Detailed analysis of SGD, Mini-batch SGD, Momentum, Adam, and Lion.

2. Practical recommendations for parameter tuning and algorithm selection.

3. Standardized reference for optimization algorithm application in deep learning.

Abstract

This survey addresses the practical configuration of optimization algorithms in deep learning, focusing on five of them: SGD, Mini-batch SGD, Momentum, Adam, and Lion. For each algorithm, it systematically analyzes the core advantages, the limitations, and the key practical recommendations. The aim is to build an in-depth understanding of these algorithms and to provide a standardized reference for selecting optimizers, tuning their parameters, and improving their performance in both academic research and engineering practice, across models of different scales and a variety of training scenarios.
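For orientation, the five update rules named in the abstract can be sketched on a scalar parameter. This is a minimal illustration, not the survey's own notation or code: the function names, the toy objective f(w) = (w - 3)^2, and every default hyperparameter value are assumptions chosen for the example, and the formulas follow the commonly published forms of each optimizer.

```python
import math

def grad(w):
    # Gradient of the assumed toy objective f(w) = (w - 3)^2.
    return 2.0 * (w - 3.0)

def minibatch_grad(w, batch):
    # Mini-batch gradient: average of per-sample gradients of (w - x)^2.
    return sum(2.0 * (w - x) for x in batch) / len(batch)

def sgd_step(w, g, lr=0.1):
    # Plain (mini-batch) SGD: step against the gradient.
    return w - lr * g

def momentum_step(w, g, v, lr=0.1, mu=0.9):
    # Momentum: accumulate a velocity, then step along it.
    v = mu * v + g
    return w - lr * v, v

def adam_step(w, g, m, v, t, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    # Adam: bias-corrected first and second moment estimates (t starts at 1).
    m = b1 * m + (1 - b1) * g
    v = b2 * v + (1 - b2) * g * g
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    return w - lr * m_hat / (math.sqrt(v_hat) + eps), m, v

def sign(x):
    # Returns -1, 0, or 1.
    return (x > 0) - (x < 0)

def lion_step(w, g, m, lr=0.01, b1=0.9, b2=0.99, wd=0.0):
    # Lion: step by the sign of an interpolated momentum, with decoupled
    # weight decay; the momentum itself is updated afterwards using b2.
    update = sign(b1 * m + (1 - b1) * g)
    w = w - lr * (update + wd * w)
    m = b2 * m + (1 - b2) * g
    return w, m

if __name__ == "__main__":
    # Run a few Adam steps on the toy objective, starting from w = 0.
    w, m, v = 0.0, 0.0, 0.0
    for t in range(1, 6):
        w, m, v = adam_step(w, grad(w), m, v, t)
    print(w)
```

Mini-batch SGD differs from plain SGD only in how the gradient is estimated, so in this sketch it is `sgd_step` applied to the output of `minibatch_grad`. Note that Lion's sign-based step has a fixed magnitude per parameter, which is why its assumed learning rate here is smaller than the others.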

Taxonomy

Topics: Machine Learning and Data Classification · Metaheuristic Optimization Algorithms Research · Advanced Neural Network Applications