Activation Functions: Dive into an optimal activation function

Vipul Bansal

arXiv:2202.12065 · cs.LG · February 25, 2022 · 1 citation

TL;DR

This paper investigates optimizing activation functions in neural networks by combining existing functions and tuning their weights, revealing layer-dependent preferences for ReLU-like or convergent functions across image datasets.

Contribution

It introduces a method to optimize activation functions as weighted sums of existing ones and analyzes their layer-wise preferences in neural networks.

Findings

01. ReLU often dominates in the optimized combination.

02. Initial layers favor ReLU or LeakyReLU, while deeper layers prefer convergent functions.

03. Optimized activation functions improve network performance on image datasets.

Abstract

Activation functions have emerged as one of the essential components of neural networks, and the choice of an adequate activation function can affect the accuracy of these models. In this study, we search for an optimal activation function by defining it as a weighted sum of existing activation functions and optimizing these weights while training the network. The study uses three activation functions, ReLU, tanh, and sin, on three popular image datasets, MNIST, FashionMNIST, and KMNIST. We observe that the ReLU activation function can easily dominate the other activation functions. We also see that initial layers prefer ReLU- or LeakyReLU-type activation functions, while deeper layers tend to prefer more convergent activation functions.
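
The weighted-sum idea in the abstract can be illustrated with a minimal sketch in plain Python. This is not the paper's implementation: the function names (`combined_activation`, `weight_gradient`, `sgd_step`), the scalar input, the squared-error loss, and the learning rate are all assumptions made for illustration. This page does not say how the paper initializes, normalizes, or shares the weights across layers.

```python
import math

def relu(x):
    """Rectified linear unit for a scalar input."""
    return max(0.0, x)

# The three candidate activations named in the abstract.
CANDIDATES = (relu, math.tanh, math.sin)

def combined_activation(x, weights):
    """Weighted sum of the candidate activations at a scalar x."""
    return sum(w * f(x) for w, f in zip(weights, CANDIDATES))

def weight_gradient(x):
    """Derivative of the combined activation w.r.t. each weight.

    The combination is linear in the weights, so the derivative
    with respect to weight i is simply candidate i evaluated at x.
    """
    return [f(x) for f in CANDIDATES]

def sgd_step(weights, x, target, lr=0.1):
    """One gradient step on squared error (assumed loss, for illustration)."""
    error = combined_activation(x, weights) - target
    grads = [2.0 * error * g for g in weight_gradient(x)]
    return [w - lr * g for w, g in zip(weights, grads)]
```

In a real network the weights would be trained jointly with the other parameters by backpropagation, and each layer could hold its own weight vector, which is what would allow layer-wise preferences like those described above to be read off after training.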

Taxonomy

Topics: Neural Networks and Applications · Machine Learning and Data Classification · Advanced Neural Network Applications