The Effects of Hyperparameters on SGD Training of Neural Networks
Thomas M. Breuel

TL;DR
This note reports large-scale experiments on how learning rate, batch size, and depth affect the performance of SGD-trained neural network classifiers, with emphasis on how these hyperparameters interact.
Contribution
It provides large-scale experimental evidence on hyperparameter effects and their interactions, addressing the limited exploration of parameter spaces in earlier work.
Findings
Learning rate, batch size, and depth each significantly influence training outcomes.
These hyperparameters interact, so the effect of one depends on the settings of the others.
Optimizing hyperparameters remains complex and context-dependent.
Abstract
The performance of neural network classifiers is determined by a number of hyperparameters, including learning rate, batch size, and depth. A number of attempts have been made in the literature to explore these parameters and, at times, to develop methods for optimizing them. However, exploration of parameter spaces has often been limited. In this note, I report the results of large-scale experiments exploring these different parameters and their interactions.
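To make the idea of exploring hyperparameters and their interactions concrete, the following is a minimal sketch of a grid sweep over learning rate and batch size. It is not the author's experimental setup: it assumes a toy one-dimensional linear regression trained with plain minibatch SGD, and the names make_data, train_sgd, and sweep are hypothetical helpers introduced only for illustration. Depth is left out because a linear model has no layers to vary.

```python
import itertools
import random


def make_data(n=200, seed=0):
    """Synthetic 1-D regression data (an assumption): y = 3x + 1 plus noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [3.0 * x + 1.0 + rng.gauss(0.0, 0.1) for x in xs]
    return xs, ys


def mse(w, b, xs, ys):
    """Mean squared error of the linear model w*x + b."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train_sgd(xs, ys, lr, batch_size, epochs=20, seed=0):
    """Plain minibatch SGD on a linear model; returns the final training MSE."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    idx = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(idx)
        for start in range(0, len(idx), batch_size):
            batch = idx[start:start + batch_size]
            # Gradients of the batch-mean squared error w.r.t. w and b.
            gw = sum(2 * (w * xs[i] + b - ys[i]) * xs[i] for i in batch) / len(batch)
            gb = sum(2 * (w * xs[i] + b - ys[i]) for i in batch) / len(batch)
            w -= lr * gw
            b -= lr * gb
    return mse(w, b, xs, ys)


def sweep(learning_rates, batch_sizes):
    """Train once per (learning rate, batch size) pair on the same data."""
    xs, ys = make_data()
    results = {}
    for lr, bs in itertools.product(learning_rates, batch_sizes):
        results[(lr, bs)] = train_sgd(xs, ys, lr, bs)
    return results


if __name__ == "__main__":
    grid = sweep([0.001, 0.01, 0.1], [1, 16, 64])
    for (lr, bs), loss in sorted(grid.items()):
        print(f"lr={lr:<6} batch={bs:<3} final_mse={loss:.4f}")
```

Because the number of epochs is fixed, a larger batch means fewer parameter updates, so in this toy setting a small learning rate paired with a large batch tends to finish far from the optimum. This is one simple way the two hyperparameters can interact. A real study would also vary depth, use held-out data, and repeat runs across random seeds.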
Taxonomy
Topics: Machine Learning and Data Classification · Neural Networks and Applications · Advanced Neural Network Applications
