Gradient Descent Algorithm Survey
Deng Fucheng, Wang Wanjie, Gong Ao, Wang Xiaoqi, Wang Fan

TL;DR
This survey reviews five major gradient descent algorithms used in deep learning, analyzing the advantages and limitations of each and offering practical recommendations to guide selection and tuning across training scenarios.
Contribution
It provides a systematic comparison of five key optimization algorithms, together with practical guidance that helps researchers and practitioners apply and tune them.
Findings
Detailed analysis of SGD, Mini-batch SGD, Momentum, Adam, and Lion.
Practical recommendations for parameter tuning and algorithm selection.
Standardized reference for optimization algorithm application in deep learning.
Abstract
Motivated by the practical need to configure optimization algorithms in deep learning, this article examines five major algorithms: SGD, Mini-batch SGD, Momentum, Adam, and Lion. It systematically analyzes the core advantages and limitations of each algorithm and gives key practical recommendations. The aim is to build an in-depth understanding of these algorithms and to provide a standardized reference for selecting optimizers, tuning their parameters, and improving performance in both academic research and engineering practice, helping to address optimization challenges across models of different scales and a variety of training scenarios.
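To make the five algorithms named in the abstract concrete, the following is a minimal sketch of their update rules in their commonly published textbook forms. It is not taken from the survey itself. The toy least-squares problem, the function names (`grad`, `sgd_step`, `momentum_step`, `adam_step`, `lion_step`, `train`), and the default hyperparameters are all assumptions chosen for illustration. In this sketch, SGD and Mini-batch SGD share one update rule and differ only in how many samples are averaged into the gradient.

```python
import math
import random

# Hypothetical toy problem: fit y = w * x by least squares; the optimum is w = 2.
XS = [1.0, 2.0, 3.0, 4.0]
YS = [2.0, 4.0, 6.0, 8.0]

def grad(w, batch):
    """Average gradient of 0.5 * (w*x - y)**2 over the sample indices in batch."""
    return sum((w * XS[i] - YS[i]) * XS[i] for i in batch) / len(batch)

def sgd_step(w, g, state, lr):
    # Plain (mini-batch) SGD: step against the gradient. The state is unused.
    return w - lr * g

def momentum_step(w, g, state, lr, mu=0.9):
    # Momentum: accumulate a velocity from past gradients, then step along it.
    state["v"] = mu * state["v"] + g
    return w - lr * state["v"]

def adam_step(w, g, state, lr, b1=0.9, b2=0.999, eps=1e-8):
    # Adam: bias-corrected running averages of the gradient and its square.
    state["t"] += 1
    state["m"] = b1 * state["m"] + (1 - b1) * g
    state["v"] = b2 * state["v"] + (1 - b2) * g * g
    m_hat = state["m"] / (1 - b1 ** state["t"])
    v_hat = state["v"] / (1 - b2 ** state["t"])
    return w - lr * m_hat / (math.sqrt(v_hat) + eps)

def lion_step(w, g, state, lr, b1=0.9, b2=0.99, wd=0.0):
    # Lion: step along the sign of an interpolation between momentum and
    # gradient, with decoupled weight decay wd; then update the momentum.
    c = b1 * state["m"] + (1 - b1) * g
    sign = (c > 0) - (c < 0)
    state["m"] = b2 * state["m"] + (1 - b2) * g
    return w - lr * (sign + wd * w)

def train(step, lr, batch_size, steps=500, seed=0):
    """Run `steps` updates from w = 0, sampling batch_size indices per step."""
    rng = random.Random(seed)
    w = 0.0
    state = {"m": 0.0, "v": 0.0, "t": 0}
    for _ in range(steps):
        batch = rng.sample(range(len(XS)), batch_size)
        w = step(w, grad(w, batch), state, lr)
    return w
```

In this sketch, `train(sgd_step, lr=0.01, batch_size=1)` corresponds to single-sample SGD and `train(sgd_step, lr=0.01, batch_size=2)` to Mini-batch SGD. Because every Lion step has magnitude `lr` regardless of the gradient size, Lion is typically run with a smaller learning rate than Adam, and on a toy problem like this one it tends to oscillate around the optimum rather than settle on it.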
Taxonomy
Topics: Machine Learning and Data Classification · Metaheuristic Optimization Algorithms Research · Advanced Neural Network Applications
