Geometric Monomial (GEM): a family of rational 2N-differentiable activation functions

Eylon E. Krause

arXiv:2604.21677·cs.LG·April 24, 2026

TL;DR

This paper introduces GEM, a family of smooth, rational activation functions that narrows the accuracy gap to GELU on CIFAR-100 + ResNet-56 and, in its SE-GEM variant, surpasses GELU on CIFAR-10 + ResNet-56.

Contribution

The author proposes a family of $C^{2N}$-smooth rational activation functions in three variants (GEM, E-GEM, and SE-GEM), aiming for ReLU-like performance with purely rational arithmetic.

Findings

01

GEM with $N=1$ reduces the GELU deficit on CIFAR-100 + ResNet-56 from 6.10% to 2.12%.

02

SE-GEM surpasses GELU on CIFAR-10 + ResNet-56.

03

E-GEM reduces the GELU deficit on CIFAR-100 + ResNet-56 to 0.62%.

Abstract

The choice of activation function plays a crucial role in the optimization and performance of deep neural networks. While the Rectified Linear Unit (ReLU) remains the dominant choice due to its simplicity and effectiveness, its lack of smoothness may hinder gradient-based optimization in deep architectures. In this work we propose a family of $C^{2N}$-smooth activation functions whose gate follows a log-logistic CDF, achieving ReLU-like performance with purely rational arithmetic. We introduce three variants: GEM (the base family), E-GEM (an $\epsilon$-parameterized generalization enabling arbitrary $L^{p}$-approximation of ReLU), and SE-GEM (a piecewise variant eliminating dead neurons with $C^{2N}$ junction smoothness). An $N$-ablation study establishes $N = 1$ as optimal for standard-depth networks, reducing the GELU deficit on CIFAR-100 + ResNet-56 from 6.10% to 2.12%. The smoothness…
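The abstract says the gate follows a log-logistic CDF but does not state the closed form of GEM on this page. The sketch below is only an illustration of what such a gated activation could look like. It assumes the activation is the input multiplied by a log-logistic CDF with unit scale and shape $2N$; the function names `log_logistic_cdf` and `gated_activation` and that choice of parameters are hypothetical and are not taken from the paper.

```python
def log_logistic_cdf(x: float, shape: float, scale: float = 1.0) -> float:
    """CDF of the log-logistic distribution: (x/scale)^shape / (1 + (x/scale)^shape).

    The distribution is supported on x > 0, so the CDF is 0 for x <= 0.
    """
    if x <= 0.0:
        return 0.0
    r = (x / scale) ** shape
    return r / (1.0 + r)


def gated_activation(x: float, n: int = 1) -> float:
    """Hypothetical gate-times-input activation (an assumption, not the paper's formula).

    With unit scale and shape 2n, this equals x^(2n+1) / (1 + x^(2n)) for x > 0
    and 0 otherwise, so it needs only multiplication, addition, and one division.
    """
    return x * log_logistic_cdf(x, shape=2 * n)


if __name__ == "__main__":
    # Compare the sketch against ReLU at a few sample points.
    for x in (-2.0, 0.0, 0.5, 1.0, 2.0, 10.0):
        relu = max(x, 0.0)
        print(f"x={x:6.2f}  sketch={gated_activation(x):8.4f}  relu={relu:8.4f}")
```

Under these assumptions the function is zero for non-positive inputs and approaches the identity for large positive inputs, which matches the ReLU-like behaviour the abstract describes. Near zero the positive branch behaves like $x^{2N+1}$, so its first $2N$ derivatives vanish there and match the zero branch. The E-GEM and SE-GEM variants are not sketched because their definitions are not given in the text above.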
