Computational issues in Optimization for Deep networks
Corrado Coppola, Lorenzo Papa, Marco Boresta, Irene Amerini, Laura Palagi

TL;DR
This paper investigates how different optimization algorithms and hyperparameter settings affect the training and performance of deep neural networks, highlighting robustness issues and the influence of network architecture.
Contribution
It provides a comprehensive analysis of the interaction between optimization algorithms, hyperparameters, and network architecture in deep learning training.
Findings
Optimization algorithms show varying robustness to initial conditions.
Network depth and width affect optimization performance.
Hyperparameter choices significantly influence convergence and accuracy.
Abstract
This paper investigates computational issues in training deep neural network architectures, with an eye to the interaction between the optimization algorithm and classification performance. In particular, we analyze the behaviour of state-of-the-art optimization algorithms in relation to their hyperparameter settings, in order to assess how robust they are to the choice of starting point, which may lead them to different local solutions. We conduct extensive computational experiments using nine open-source optimization algorithms to train deep Convolutional Neural Network architectures on a multi-class image classification task. Specifically, we consider several architectures obtained by varying the number of layers and the number of neurons per layer, in order to evaluate the impact of different width and depth structures on computational optimization performance.
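The notion of robustness to the starting point can be illustrated with a toy sketch. This is not the paper's experimental setup, which trains CNNs with nine open-source optimizers. The 1-D objective `f`, the three hand-written update rules, the learning rates, and the `robustness` helper below are all assumptions chosen only to show how the same optimizer can end on different local solutions depending on where it starts.

```python
import math
import random

def f(x):
    # Hypothetical nonconvex 1-D objective with several local minima.
    return 0.05 * x**2 + math.sin(3 * x)

def grad(x):
    # Analytic derivative of f.
    return 0.1 * x + 3 * math.cos(3 * x)

def gd(x0, lr=0.01, steps=2000):
    # Plain gradient descent.
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x

def momentum(x0, lr=0.01, beta=0.9, steps=2000):
    # Heavy-ball momentum.
    x, v = x0, 0.0
    for _ in range(steps):
        v = beta * v + grad(x)
        x -= lr * v
    return x

def adam(x0, lr=0.01, b1=0.9, b2=0.999, eps=1e-8, steps=2000):
    # Adam with bias-corrected moment estimates.
    x, m, s = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = b1 * m + (1 - b1) * g
        s = b2 * s + (1 - b2) * g * g
        m_hat = m / (1 - b1**t)
        s_hat = s / (1 - b2**t)
        x -= lr * m_hat / (math.sqrt(s_hat) + eps)
    return x

def robustness(optimizer, starts):
    # Mean final objective and spread (max - min) over starting points.
    # A small spread suggests low sensitivity to initialization.
    finals = [f(optimizer(x0)) for x0 in starts]
    return sum(finals) / len(finals), max(finals) - min(finals)

rng = random.Random(0)  # fixed seed so every optimizer sees the same starts
starts = [rng.uniform(-5, 5) for _ in range(20)]
optimizers = {"gd": gd, "momentum": momentum, "adam": adam}
results = {name: robustness(opt, starts) for name, opt in optimizers.items()}

for name, (mean_f, spread) in results.items():
    print(f"{name:9s} mean f = {mean_f:.4f}  spread = {spread:.4f}")
```

A nonzero spread for an optimizer means that different starting points led it to local solutions of different quality. In the paper's setting, the analogous comparison would be made over random weight initializations of each CNN architecture.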
Taxonomy
Topics: Neural Networks and Applications
