Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs)
Djork-Arné Clevert, Thomas Unterthiner, Sepp Hochreiter

TL;DR
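The paper introduces ELUs, an activation function that speeds up learning and improves accuracy in deep neural networks. ELUs push mean unit activations toward zero, which reduces bias shift, and saturate to a negative value that gives a noise-robust deactivation state. The authors report that ELUs outperform ReLUs and related variants.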
Contribution
The authors propose the exponential linear unit (ELU), a novel activation function that improves learning speed and accuracy in deep networks compared to existing functions like ReLU.
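For reference, the ELU is usually written as the piecewise function below, with a hyperparameter $\alpha > 0$ that sets the value the unit saturates to for very negative inputs. The symbols $f$, $x$ and $\alpha$ follow the common convention for this function; they do not appear in the summary on this page.

```latex
f(x) =
\begin{cases}
  x, & x > 0 \\
  \alpha \left( e^{x} - 1 \right), & x \le 0
\end{cases}
\qquad
f'(x) =
\begin{cases}
  1, & x > 0 \\
  f(x) + \alpha, & x \le 0
\end{cases}
```

The identity branch for $x > 0$ is what ELUs share with ReLUs, LReLUs and PReLUs; the exponential branch is what lets the output go negative and level off near $-\alpha$.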
Findings
ELUs lead to faster learning and better generalization.
ELU networks outperform ReLU networks on CIFAR-100 and ImageNet.
ELU achieves top results on CIFAR-10 and CIFAR-100 datasets.
Abstract
We introduce the "exponential linear unit" (ELU) which speeds up learning in deep neural networks and leads to higher classification accuracies. Like rectified linear units (ReLUs), leaky ReLUs (LReLUs) and parametrized ReLUs (PReLUs), ELUs alleviate the vanishing gradient problem via the identity for positive values. However, ELUs have improved learning characteristics compared to the units with other activation functions. In contrast to ReLUs, ELUs have negative values which allows them to push mean unit activations closer to zero like batch normalization but with lower computational complexity. Mean shifts toward zero speed up learning by bringing the normal gradient closer to the unit natural gradient because of a reduced bias shift effect. While LReLUs and PReLUs have negative values, too, they do not ensure a noise-robust deactivation state. ELUs saturate to a negative value with…
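The two properties the abstract highlights (negative outputs that pull the mean activation toward zero, and saturation for very negative inputs) can be checked with a minimal sketch. This is an illustration under stated assumptions, not the authors' code: the helper names `relu`, `elu` and `mean_activation`, the choice `alpha=1.0`, and the use of standard-normal pre-activations are all assumptions made for the example.

```python
import math
import random


def relu(x):
    """Rectified linear unit: identity for positive inputs, zero otherwise."""
    return x if x > 0.0 else 0.0


def elu(x, alpha=1.0):
    """Exponential linear unit (alpha is an assumed hyperparameter value).

    Identity for positive inputs; for non-positive inputs it returns
    alpha * (exp(x) - 1), which approaches -alpha as x becomes very negative.
    """
    return x if x > 0.0 else alpha * math.expm1(x)


def mean_activation(fn, samples):
    """Average output of an activation function over a list of inputs."""
    return sum(fn(s) for s in samples) / len(samples)


# Assumed toy input: zero-mean, unit-variance pre-activations.
rng = random.Random(0)
samples = [rng.gauss(0.0, 1.0) for _ in range(100_000)]

relu_mean = mean_activation(relu, samples)
elu_mean = mean_activation(elu, samples)

print(f"mean ReLU activation: {relu_mean:.3f}")
print(f"mean ELU activation:  {elu_mean:.3f}")
print(f"ELU at x = -50:       {elu(-50.0):.6f}")
```

On zero-mean inputs the ReLU mean is strictly positive because every negative input is mapped to zero, while the ELU mean lands closer to zero because the negative branch contributes negative outputs. The last line shows the saturation: far below zero the output stays near `-alpha`, so small changes in a strongly negative input barely change the activation.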
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Machine Learning and ELM
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Exponential Linear Unit · Batch Normalization
