Adaptive Blending Units: Trainable Activation Functions for Deep Neural Networks

Leon René Sütfeld; Flemming Brieger; Holger Finger; Sonja Füllhase; Gordon Pipa

arXiv:1806.10064·cs.LG·June 27, 2018

TL;DR

This paper introduces Adaptive Blending Units (ABUs), trainable activation functions that learn both their shape and their overall scaling during training. The reported gains are attributed to adaptive scaling mitigating covariate shift and to the units' ability to keep adapting as training proceeds.

Contribution

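The paper proposes ABUs, a trainable activation function defined as a learned linear combination of a set of activation functions, so that the network learns its preferred activation shape instead of having one fixed in advance. It also analyzes the effect of adaptive scaling in common activation functions.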

Findings

01. ABUs outperform standard activation functions across various network configurations.

02. Adaptive scaling reduces covariate shifts during training.

03. ABUs' adaptability leads to improved training efficiency and performance.

Abstract

The most widely used activation functions in current deep feed-forward neural networks are rectified linear units (ReLU), although many alternatives have also been applied successfully. However, none of the alternatives has managed to consistently outperform the rest, and there is no unified theory connecting properties of the task and network with properties of activation functions for most efficient training. A possible solution is to have the network learn its preferred activation functions. In this work, we introduce Adaptive Blending Units (ABUs), a trainable linear combination of a set of activation functions. Since ABUs learn the shape as well as the overall scaling of the activation function, we also analyze the effects of adaptive scaling in common activation functions. We experimentally demonstrate advantages of both adaptive scaling and ABUs over common activation functions…
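The idea of a "trainable linear combination of a set of activation functions" can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the base set (`tanh`, ReLU, identity), the uniform initialization, the toy squared-error target, and the names `abu`, `abu_grad_alpha`, and `scaled_activation` are all assumptions chosen for illustration. The blend weights `alpha` are the trainable parameters, and the gradient of the output with respect to each weight is simply the corresponding base function's output.

```python
import numpy as np

# Illustrative base activations. The choice of set is an assumption,
# not necessarily the set used in the paper.
def relu(x):
    return np.maximum(x, 0.0)

def identity(x):
    return x

BASE_FUNCTIONS = (np.tanh, relu, identity)

def abu(x, alpha):
    """Blended activation: sum_i alpha[i] * f_i(x)."""
    return sum(a * f(x) for a, f in zip(alpha, BASE_FUNCTIONS))

def abu_grad_alpha(x, upstream):
    """Gradient of a scalar loss w.r.t. alpha, given dL/d(output)."""
    return np.array([np.sum(upstream * f(x)) for f in BASE_FUNCTIONS])

def scaled_activation(x, scale, f=relu):
    """Adaptive scaling of a single fixed activation: scale * f(x)."""
    return scale * f(x)

# Toy demonstration: one gradient step on the blend weights.
x = np.array([-2.0, -0.5, 0.0, 1.5])
target = relu(x)  # hypothetical target shape for the toy loss

# Uniform initialization (an assumption made for this sketch).
alpha = np.full(len(BASE_FUNCTIONS), 1.0 / len(BASE_FUNCTIONS))

residual = abu(x, alpha) - target
loss_before = 0.5 * np.sum(residual ** 2)

learning_rate = 0.05
alpha_new = alpha - learning_rate * abu_grad_alpha(x, residual)

loss_after = 0.5 * np.sum((abu(x, alpha_new) - target) ** 2)
print(loss_before, loss_after)
```

In a real network the weights `alpha` would be updated by the same optimizer as the layer weights, typically with one weight vector per layer. Because the weights are unconstrained in this sketch, they control the overall scale of the activation as well as its shape; `scaled_activation` isolates the scaling effect for a single fixed function.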
