A Significantly Better Class of Activation Functions Than ReLU Like Activation Functions
Mathew Mithra Noel, Yug Oswal

TL;DR
This paper proposes new cone and parabolic-cone activation functions that outperform ReLU-like functions on benchmarks by enabling neurons to divide input space more precisely, leading to higher accuracy and faster training.
Contribution
Introduction of cone and parabolic-cone activation functions that significantly outperform ReLU-like functions on standard benchmarks and enable simpler neural network architectures.
Findings
Outperform ReLU on CIFAR-10 and Imagenette benchmarks.
Allow a single neuron to learn XOR.
Speed up training due to larger derivatives.
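The single-neuron XOR finding can be illustrated with a minimal sketch. It assumes the cone activation has the form 1 - |z - 1|, which matches the abstract's description (positive on a finite interval, zero at its end-points) but is not stated on this page. The weights and bias are hand-picked for illustration, not learned, and the names `cone` and `single_neuron` are hypothetical.

```python
def cone(z: float) -> float:
    """Assumed cone activation: positive on (0, 2), zero at 0 and 2, negative elsewhere."""
    return 1.0 - abs(z - 1.0)


def single_neuron(x1: float, x2: float,
                  w1: float = 1.0, w2: float = 1.0, b: float = 0.0) -> float:
    """One neuron: cone applied to the weighted sum (hand-picked, not trained, weights)."""
    return cone(w1 * x1 + w2 * x2 + b)


# Evaluate the neuron on the four XOR inputs.
xor_table = {(a, b): single_neuron(a, b) for a in (0, 1) for b in (0, 1)}

for inputs, output in xor_table.items():
    print(inputs, output)
```

With these weights the pre-activation is x1 + x2, so the two inputs with exactly one active bit land at the peak of the cone and the other two land on its zeros. A neuron with a monotonic activation such as ReLU cannot produce this pattern with one unit, because XOR is not linearly separable.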
Abstract
This paper introduces a significantly better class of activation functions than the almost universally used ReLU-like and sigmoidal classes of activation functions. It proposes two new activation functions, the Cone and the Parabolic-Cone, that differ drastically from popular activation functions and significantly outperform them on the CIFAR-10 and Imagenette benchmarks. The cone activation functions are positive only on a finite interval, zero at the end-points of that interval, and strictly negative everywhere else. Thus the set of inputs that produce a positive output for a neuron with a cone activation function is a hyperstrip and not a half-space, as is usually the case. Since a hyperstrip is the region between two parallel hyperplanes, it allows neurons to more finely divide the input feature space into positive and negative classes than with infinitely wide…
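The hyperstrip versus half-space distinction can be sketched in a few lines. The closed forms used here, 1 - |z - 1| for the Cone and z(2 - z) for the Parabolic-Cone, are assumptions consistent with the abstract's description but not given on this page. The weight vector, bias, and sample points are hypothetical values chosen for illustration.

```python
def relu(z: float) -> float:
    return max(0.0, z)


def cone(z: float) -> float:
    # Assumed form: positive only for 0 < z < 2.
    return 1.0 - abs(z - 1.0)


def parabolic_cone(z: float) -> float:
    # Assumed form: positive only for 0 < z < 2.
    return z * (2.0 - z)


def pre_activation(x, w, b):
    """Weighted sum w . x + b for one neuron."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


# Hypothetical neuron: only the first coordinate matters, so the two
# bounding hyperplanes of the strip are x[0] = 0 and x[0] = 2.
w, b = (1.0, 0.0), 0.0
points = [(-1.0, 5.0), (0.5, -3.0), (1.5, 0.0), (3.0, 2.0)]

activations = {"relu": relu, "cone": cone, "parabolic_cone": parabolic_cone}
positive = {
    name: [fn(pre_activation(p, w, b)) > 0 for p in points]
    for name, fn in activations.items()
}

for name, flags in positive.items():
    print(name, flags)
```

Under these assumptions, ReLU is positive for every point on one side of a single hyperplane, while both cone variants are positive only for points lying between the two parallel hyperplanes, so the last sample point is excluded.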
Taxonomy
Topics: Neural Networks and Applications
Methods: Sparse Evolutionary Training
