Review and Comparison of Commonly Used Activation Functions for Deep Neural Networks

Tomasz Szandała

arXiv:2010.09458·cs.LG·October 20, 2020

TL;DR

This paper reviews and compares popular activation functions in deep neural networks, analyzing their properties, advantages, disadvantages, and suitable application scenarios to guide better selection.

Contribution

It provides a comprehensive comparison of common activation functions, highlighting their characteristics, limitations, and practical recommendations for neural network design.

Findings

01  ReLU is simple and efficient but can cause dying neurons.

02  Swish offers smooth activation with better performance in some cases.

03  Sigmoid functions can cause vanishing gradients.
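
The three findings above can be checked numerically. The following is a minimal sketch rather than code from the paper: it uses only the Python standard library, the function names are illustrative, and Swish is assumed to take the common form x * sigmoid(beta * x) with beta = 1. It evaluates each activation's derivative to show where the gradient shrinks (Sigmoid at large |x|), where it is exactly zero (ReLU for negative inputs), and where it stays nonzero (Swish for moderately negative inputs).

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def sigmoid_grad(x: float) -> float:
    """Derivative s(x) * (1 - s(x)); its maximum is 0.25 at x = 0."""
    s = sigmoid(x)
    return s * (1.0 - s)


def relu(x: float) -> float:
    return max(0.0, x)


def relu_grad(x: float) -> float:
    """Derivative of ReLU; exactly 0 for x <= 0 (by convention at x = 0)."""
    return 1.0 if x > 0 else 0.0


def swish(x: float, beta: float = 1.0) -> float:
    """Swish, assumed here to be x * sigmoid(beta * x)."""
    return x * sigmoid(beta * x)


def swish_grad(x: float, beta: float = 1.0) -> float:
    """Derivative of x * s(beta * x) by the product rule."""
    s = sigmoid(beta * x)
    return s + beta * x * s * (1.0 - s)


if __name__ == "__main__":
    # Print the three derivatives side by side at a few sample inputs.
    for x in (-10.0, -1.0, 0.0, 1.0, 10.0):
        print(
            f"x={x:6.1f}  sigmoid'={sigmoid_grad(x):.6f}  "
            f"relu'={relu_grad(x):.1f}  swish'={swish_grad(x):.6f}"
        )
```

A unit whose pre-activation stays negative receives a ReLU gradient of exactly zero and so stops updating, which is the "dying neuron" effect. The Sigmoid derivative never exceeds 0.25, so a product of many such factors across layers tends toward zero.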

Abstract

Activation functions are the primary decision-making units of neural networks. They determine the output of each neural node and are therefore essential to the performance of the whole network. Hence, it is critical to choose the most appropriate activation function for a neural network's computations. Acharya et al. (2018) suggest that numerous recipes have been formulated over the years, though some of them are now considered deprecated because they fail to operate properly under some conditions. These functions have a variety of characteristics that are deemed essential to successful learning, among them their monotonicity, their derivatives, and the finiteness of their range (Bach 2017). This paper evaluates commonly used activation functions, such as Swish, ReLU, Sigmoid, and so forth. This will be followed by their properties,…
