Computational issues in Optimization for Deep networks

Corrado Coppola; Lorenzo Papa; Marco Boresta; Irene Amerini; Laura Palagi

arXiv:2405.02089 · math.OC · May 6, 2024

TL;DR

This paper investigates how different optimization algorithms and hyperparameter settings affect the training and performance of deep neural networks, highlighting robustness issues and the influence of network architecture.

Contribution

It provides a comprehensive analysis of the interaction between optimization algorithms, hyperparameters, and network architecture in deep learning training.

Findings

1. Optimization algorithms show varying robustness to initial conditions.
2. Network depth and width affect optimization performance.
3. Hyperparameter choices significantly influence convergence and accuracy.

Abstract

This paper investigates computational issues in training deep neural network architectures, with an eye to the interaction between the optimization algorithm and classification performance. In particular, we analyze the behaviour of state-of-the-art optimization algorithms in relation to their hyperparameter settings, in order to assess their robustness with respect to the choice of starting point, which may lead them to different local solutions. We conduct extensive computational experiments using nine open-source optimization algorithms to train deep Convolutional Neural Network architectures on a multi-class image classification task. Specifically, we consider several architectures, varying the number of layers and the number of neurons per layer, in order to evaluate the impact of width and depth on computational optimization performance.
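The idea of robustness to the starting point can be illustrated with a toy multi-start experiment. The sketch below is not the paper's setup: it replaces the CNN training loss with a hypothetical one-dimensional nonconvex function, and it compares only plain gradient descent and Adam, chosen purely for illustration (the abstract does not name the nine algorithms). All function names, the objective, and the hyperparameter values are assumptions.

```python
import math
import random


def loss(w):
    """Toy 1-D nonconvex objective with several local minima (illustrative only)."""
    return 0.05 * w * w + math.sin(3.0 * w)


def grad(w):
    """Exact derivative of the toy objective."""
    return 0.1 * w + 3.0 * math.cos(3.0 * w)


def gradient_descent(w, lr=0.01, steps=2000):
    """Fixed-step gradient descent from starting point w."""
    for _ in range(steps):
        w -= lr * grad(w)
    return w


def adam(w, lr=0.01, steps=2000, beta1=0.9, beta2=0.999, eps=1e-8):
    """Adam update with bias correction, applied to a scalar parameter."""
    m, v = 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g * g
        m_hat = m / (1.0 - beta1 ** t)
        v_hat = v / (1.0 - beta2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w


def make_starts(n_starts=20, seed=0, low=-4.0, high=4.0):
    """Draw reproducible random starting points (range is an assumption)."""
    rng = random.Random(seed)
    return [rng.uniform(low, high) for _ in range(n_starts)]


def summarize(optimizer, starts):
    """Run one optimizer from every start and summarize the final losses."""
    losses = [loss(optimizer(w0)) for w0 in starts]
    return {
        "best": min(losses),
        "worst": max(losses),
        # Approximate count of different local solutions reached.
        "distinct": len({round(value, 2) for value in losses}),
    }


if __name__ == "__main__":
    starts = make_starts()
    for name, opt in [("gradient descent", gradient_descent), ("Adam", adam)]:
        print(name, summarize(opt, starts))
```

A gap between "best" and "worst", or a "distinct" count above one, indicates that the final solution depends on the starting point. Repeating the run with different values of `lr` gives a rough analogue of the hyperparameter sensitivity the abstract describes.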

Taxonomy

Topics: Neural Networks and Applications