Zorro: A Flexible and Differentiable Parametric Family of Activation Functions That Extends ReLU and GELU
Matias Roodschild, Jorge Gotay-Sardiñas, Victor A. Jimenez, Adrian Will

TL;DR
Zorro introduces a new family of smooth, differentiable activation functions that extend ReLU and GELU, offering adaptability and improved training stability across various neural network architectures.
Contribution
The paper presents Zorro, a flexible, differentiable activation family that generalizes ReLU and GELU, addressing their limitations and enhancing neural network training.
Findings
Zorro functions are smooth and adaptable.
They perform well across different architectures.
They approximate popular activation functions like Swish and GELU.
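For reference, the baseline activations named above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not an implementation of Zorro, whose definition is not given in this summary. The helper names (`sigmoid`, `one_sided_slopes`) and the step size `h` are assumptions made for the example. The finite-difference check shows the kink in ReLU at zero, which is the non-differentiable point the abstract refers to, while GELU and Swish have matching left and right slopes there.

```python
import math


def relu(x: float) -> float:
    """ReLU: max(0, x). Has a kink (no derivative) at x = 0."""
    return max(0.0, x)


def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def sigmoid(z: float) -> float:
    """Logistic sigmoid, branched to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def swish(x: float, beta: float = 1.0) -> float:
    """Swish: x * sigmoid(beta * x); beta = 1 is also known as SiLU."""
    return x * sigmoid(beta * x)


def one_sided_slopes(f, x0: float = 0.0, h: float = 1e-6):
    """Hypothetical helper: left and right finite-difference slopes at x0."""
    left = (f(x0) - f(x0 - h)) / h
    right = (f(x0 + h) - f(x0)) / h
    return left, right


if __name__ == "__main__":
    for name, f in [("relu", relu), ("gelu", gelu), ("swish", swish)]:
        left, right = one_sided_slopes(f)
        print(f"{name}: left slope {left:.6f}, right slope {right:.6f}")
```

For ReLU the two slopes disagree (0 on the left, 1 on the right), whereas for GELU and Swish both are close to 0.5. A smooth family that generalizes these functions would be expected to behave like the latter two at the origin.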
Abstract
Even in recent neural network architectures such as Transformers and Extended LSTM (xLSTM), and traditional ones like Convolutional Neural Networks, Activation Functions are an integral part of nearly all neural networks. They enable more effective training and capture nonlinear data patterns. More than 400 functions have been proposed over the last 30 years, including fixed or trainable parameters, but only a few are widely used. ReLU is one of the most frequently used, with GELU and Swish variants increasingly appearing. However, ReLU presents non-differentiable points and exploding gradient issues, while testing different parameters of GELU and Swish variants produces varying results, needing more parameters to adapt to datasets and architectures. This article introduces a novel set of activation functions called Zorro, a continuously differentiable and flexible family comprising…
Taxonomy
Topics: Neural Networks and Applications
Methods: Sparse Evolutionary Training · Tanh Activation · Sigmoid Activation · Long Short-Term Memory
