Activation Function Optimization Scheme for Image Classification

Abdur Rahman; Lu He; Haifeng Wang

arXiv:2409.04915 · cs.CV · September 17, 2024

TL;DR

This paper introduces an evolutionary optimization approach for discovering new activation functions for image classification. The resulting EELU functions outperform existing activation functions across various neural networks and datasets.

Contribution

The study presents a novel evolutionary framework for optimizing activation functions, leading to the development of the EELU functions that surpass current state-of-the-art functions in image classification.

Findings

01

EELU functions outperform standard activation functions in 92.8% of tested cases.

02

The optimization scheme successfully discovers activation functions better suited for diverse neural network architectures.

03

The best activation function identified is $-x\,\text{erf}(e^{-x})$, demonstrating the scheme's effectiveness.
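
To make the formula in the last finding concrete, here is a minimal scalar sketch in Python. The function name `neg_x_erf_exp` and the overflow guard are assumptions made for illustration; they are not taken from the paper or its repository, which may implement the function differently (for example, as a tensor operation).

```python
import math


def neg_x_erf_exp(x: float) -> float:
    """Evaluate f(x) = -x * erf(exp(-x)) for a single float.

    Hypothetical helper for illustration only; not the authors' code.
    """
    # exp(-x) overflows a double for x below roughly -709. erf has already
    # saturated at 1.0 long before that, so pass infinity instead.
    inner = math.exp(-x) if x > -700.0 else math.inf
    return -x * math.erf(inner)


if __name__ == "__main__":
    for value in (-5.0, -1.0, 0.0, 1.0, 5.0):
        print(f"f({value:+.1f}) = {neg_x_erf_exp(value):+.6f}")
```

For large negative inputs, erf(e^{-x}) approaches 1, so the function approaches -x. For large positive inputs, erf(e^{-x}) approaches 0, so the output is a small negative value that tends to zero.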

Abstract

The activation function has a significant impact on the dynamics, convergence, and performance of deep neural networks. The search for a consistent and high-performing activation function has long been a goal in deep learning model development. Existing state-of-the-art activation functions are manually designed with human expertise, except for Swish, which was developed using a reinforcement learning-based search strategy. In this study, we propose an evolutionary approach for optimizing activation functions specifically for image classification tasks, aiming to discover functions that outperform current state-of-the-art options. Through this optimization framework, we obtain a series of high-performing activation functions denoted as Exponential Error Linear Unit (EELU). The developed activation functions are evaluated for image classification tasks from two perspectives: (1) five…

Code & Models

Repositories

abdurrahman1828/afos · tf · Official

Taxonomy

Methods: Attention Is All You Need · Sigmoid Activation · Byte Pair Encoding · Absolute Position Encodings · Softmax · Label Smoothing · Dropout · Layer Normalization · Position-Wise Feed-Forward Layer · Linear Layer