SAU: Smooth activation function using convolution with approximate identities
Koushik Biswas, Sandeep Kumar, Shilpak Banerjee, Ashish Kumar Pandey

TL;DR
This paper introduces the Smooth Activation Unit (SAU), a smooth approximation of Leaky ReLU obtained by convolving it with approximate identities. The authors report that it outperforms several well-known activation functions across various datasets and models.
Contribution
The paper presents a method for smoothing non-differentiable activation functions by convolving them with approximate identities. It applies the method to Leaky ReLU and reports empirical gains over existing activation functions.
Findings
SAU outperforms several well-known activation functions across various datasets and models.
Replacing ReLU with SAU improves the accuracy of the ShuffleNet V2 (2.0x) model by 5.12% on CIFAR100.
SAU provides a differentiable alternative to non-smooth activation functions.
Abstract
Well-known activation functions like ReLU or Leaky ReLU are non-differentiable at the origin. Over the years, many smooth approximations of ReLU have been proposed using various smoothing techniques. We propose new smooth approximations of a non-differentiable activation function by convolving it with approximate identities. In particular, we present smooth approximations of Leaky ReLU and show that they outperform several well-known activation functions in various datasets and models. We call this function Smooth Activation Unit (SAU). Replacing ReLU by SAU, we get 5.12% improvement with ShuffleNet V2 (2.0x) model on CIFAR100 dataset.
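As an illustration of the smoothing idea in the abstract, here is a minimal Python sketch. It assumes a Gaussian kernel with standard deviation 1/beta as the approximate identity; the text above does not name the kernel. The closed form in `smoothed_leaky_relu` is derived here from that assumption and should not be read as the authors' exact definition of SAU. The function names and the parameters `alpha` and `beta` are illustrative. `numeric_convolution` is included only to check the closed form against a direct Riemann-sum convolution.

```python
import math
import numpy as np


def leaky_relu(x, alpha=0.25):
    """Leaky ReLU: identity for x >= 0, slope alpha for x < 0."""
    return np.where(x >= 0, x, alpha * x)


def smoothed_leaky_relu(x, alpha=0.25, beta=1.0):
    """Closed form of (LeakyReLU * N(0, 1/beta^2))(x).

    Assumption: the approximate identity is a Gaussian density with
    standard deviation 1/beta. Derived from
    LeakyReLU(x) = (1+alpha)/2 * x + (1-alpha)/2 * |x|.
    """
    x = np.asarray(x, dtype=float)
    erf = np.vectorize(math.erf)  # element-wise error function
    gauss = (1 - alpha) / (beta * math.sqrt(2 * math.pi)) * np.exp(-(beta * x) ** 2 / 2)
    linear = (1 + alpha) / 2 * x
    odd = (1 - alpha) / 2 * x * erf(beta * x / math.sqrt(2))
    return gauss + linear + odd


def numeric_convolution(x, alpha=0.25, beta=1.0, half_width=10.0, n=200001):
    """Riemann-sum approximation of the integral of leaky_relu(x - t) * phi(t) dt."""
    t = np.linspace(-half_width / beta, half_width / beta, n)
    dt = t[1] - t[0]
    phi = beta / math.sqrt(2 * math.pi) * np.exp(-(beta * t) ** 2 / 2)
    return float(np.sum(leaky_relu(x - t, alpha) * phi) * dt)


if __name__ == "__main__":
    for x in (-2.0, -0.5, 0.0, 0.7, 3.0):
        print(x, float(smoothed_leaky_relu(x)), numeric_convolution(x))
```

Under this assumption, increasing `beta` narrows the kernel, so the smoothed function approaches Leaky ReLU while remaining differentiable at the origin. A trainable variant would treat `alpha` and `beta` as parameters; whether and how the authors do so is not stated in the text above.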
Taxonomy
Topics: Machine Learning and Data Classification · Domain Adaptation and Few-Shot Learning · Machine Learning and Algorithms
Methods: Pointwise Convolution · Grouped Convolution · Groupwise Point Convolution · Batch Normalization · Residual Connection · Average Pooling · Depthwise Convolution · Max Pooling · ShuffleNet Block
