On the Impact of the Activation Function on Deep Neural Networks Training

Soufiane Hayou; Arnaud Doucet; Judith Rousseau

arXiv:1902.06853·stat.ML·May 28, 2019·69 cites

Open Access

TL;DR

This paper analyzes how the choice of activation function and initialization parameters affects the training efficiency and performance of deep neural networks, emphasizing the importance of the 'Edge of Chaos' for successful training.

Contribution

It provides a comprehensive theoretical analysis showing how tuning initialization and activation functions can accelerate training and enhance deep neural network performance.

Findings

01 Proper tuning of initialization accelerates training

02 Activation function choice impacts network performance

03 Edge of Chaos is critical for trainability

Abstract

The weight initialization and the activation function of deep neural networks have a crucial impact on the performance of the training procedure. An inappropriate selection can lead to loss of input information during forward propagation and to exponentially vanishing or exploding gradients during back-propagation. Understanding the theoretical properties of untrained random networks is key to identifying which deep networks may be trained successfully, as recently demonstrated by Samuel et al. (2017), who showed that for deep feedforward neural networks only a specific choice of hyperparameters, known as the 'Edge of Chaos', can lead to good performance. While the work by Samuel et al. (2017) discusses trainability issues, we focus here on training acceleration and overall performance. We give a comprehensive theoretical analysis of the Edge of Chaos and show that we can indeed tune…
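
The abstract names the Edge of Chaos without stating the condition that defines it. As a rough illustration only, the sketch below assumes the common mean-field formulation for a wide, fully connected tanh network with weight variance sigma_w^2 / width and bias variance sigma_b^2: the pre-activation variance converges to a fixed point q = sigma_b^2 + sigma_w^2 E[tanh(sqrt(q) Z)^2] with Z ~ N(0, 1), and the Edge of Chaos is where chi = sigma_w^2 E[tanh'(sqrt(q) Z)^2] equals 1. The function names, the choice of tanh, the quadrature grid, and the bisection bracket are all assumptions made for this example, not the authors' code.

```python
import math

# Uniform quadrature grid for Z ~ N(0, 1) on [-8, 8], weights normalized to sum to 1.
_Z = [-8.0 + 16.0 * i / 200 for i in range(201)]
_W = [math.exp(-0.5 * z * z) for z in _Z]
_TOTAL = sum(_W)
_W = [w / _TOTAL for w in _W]


def gauss_mean(f, q):
    """Approximate E[f(sqrt(q) * Z)] for Z ~ N(0, 1)."""
    s = math.sqrt(q)
    return sum(w * f(s * z) for w, z in zip(_W, _Z))


def limiting_variance(sigma_w, sigma_b, iters=200):
    """Iterate the assumed variance map q <- sigma_b^2 + sigma_w^2 E[tanh^2]."""
    q = 1.0
    for _ in range(iters):
        q = sigma_b ** 2 + sigma_w ** 2 * gauss_mean(lambda x: math.tanh(x) ** 2, q)
    return q


def chi(sigma_w, sigma_b):
    """chi = sigma_w^2 E[tanh'(sqrt(q) Z)^2], evaluated at the limiting variance."""
    q = limiting_variance(sigma_w, sigma_b)
    return sigma_w ** 2 * gauss_mean(lambda x: (1.0 - math.tanh(x) ** 2) ** 2, q)


def edge_of_chaos_sigma_w(sigma_b, lo=1.0, hi=3.0, steps=30):
    """Bisect on sigma_w for chi = 1; the bracket [lo, hi] is an assumption."""
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if chi(mid, sigma_b) < 1.0:
            lo = mid  # ordered phase: gradients shrink with depth
        else:
            hi = mid  # chaotic phase: gradients grow with depth
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    sigma_b = 0.3  # arbitrary illustrative value
    sigma_w = edge_of_chaos_sigma_w(sigma_b)
    print(f"sigma_b={sigma_b}, sigma_w on the Edge of Chaos ~ {sigma_w:.4f}")
```

Under these assumptions, chi below 1 corresponds to the vanishing-gradient regime and chi above 1 to the exploding-gradient regime, so the bisection locates the boundary between the two for a given sigma_b.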

Taxonomy

Topics: Neural Networks and Applications