The Effects of Hyperparameters on SGD Training of Neural Networks

Thomas M. Breuel

arXiv:1508.02788 · cs.NE · August 13, 2015 · 52 cites

TL;DR

This paper presents large-scale experiments analyzing how hyperparameters like learning rate, batch size, and depth affect neural network training performance, highlighting their interactions and optimization challenges.

Contribution

It reports experimental results on the effects of hyperparameters and their interactions, addressing the limited exploration of these parameter spaces in earlier work.

Findings

01. Hyperparameters significantly influence training outcomes.

02. Interactions between hyperparameters affect performance.

03. Optimization of hyperparameters remains complex and context-dependent.

Abstract

The performance of neural network classifiers is determined by a number of hyperparameters, including learning rate, batch size, and depth. A number of attempts have been made to explore these parameters in the literature, and at times, to develop methods for optimizing them. However, exploration of parameter spaces has often been limited. In this note, I report the results of large scale experiments exploring these different parameters and their interactions.
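
As a rough illustration of what exploring several hyperparameters "and their interactions" can involve, the sketch below enumerates a full grid over learning rate, batch size, and depth and ranks the configurations by a score. It is not the procedure used in the paper: the helper names `grid` and `sweep`, the `evaluate` callable, and the values in `space` are all assumptions made for this example.

```python
import itertools


def grid(space):
    """Yield one dict per combination of the values listed in `space`."""
    names = list(space)
    for values in itertools.product(*(space[name] for name in names)):
        yield dict(zip(names, values))


def sweep(space, evaluate):
    """Score every configuration and return (score, config) pairs, lowest first.

    `evaluate` is a hypothetical callable supplied by the caller; it is
    assumed to train a network with the given keyword arguments and
    return an error rate (lower is better).
    """
    results = [(evaluate(**config), config) for config in grid(space)]
    results.sort(key=lambda pair: pair[0])
    return results


# Illustrative values only; these are not the settings used in the paper.
space = {
    "learning_rate": [0.01, 0.1, 1.0],
    "batch_size": [1, 16, 256],
    "depth": [1, 2, 4],
}
```

A full grid is used here, rather than varying one hyperparameter at a time, because only joint settings can reveal interactions such as a learning rate that works at one batch size but not at another. The cost is that the number of training runs grows as the product of the list lengths.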

Taxonomy

Topics: Machine Learning and Data Classification · Neural Networks and Applications · Advanced Neural Network Applications