Exponential Approximation Rates and Parameter Efficiency of Learnable Bernstein Activations

Ibrahim Albool; Malak Gamal El-Din; Salma Elmalaki; Yasser Shoukry

arXiv:2602.04264 · cs.LG · May 14, 2026

TL;DR

This paper analyzes learnable Bernstein polynomial activations in deep neural networks, showing that approximation error decays as $O(n^{-L})$ in network depth $L$ and polynomial order $n$, and reporting substantial parameter savings over traditional activations.

Contribution

The paper provides a theoretical analysis of DeepBern-Nets (DBNs), networks with learnable Bernstein polynomial activations, showing exponential approximation rates and validating them with 1,344 experiments on the HIGGS and SUSY datasets.
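
For orientation, the standard textbook form of a degree-$n$ Bernstein polynomial is given below. The interval bounds $l$ and $u$ and the coefficient names $c_k$ are notation assumed here for illustration; this page does not state the paper's exact parameterization.

$$
\sigma(x) = \sum_{k=0}^{n} c_k \, b_{k,n}(x), \qquad
b_{k,n}(x) = \binom{n}{k} \frac{(x - l)^{k} \, (u - x)^{n-k}}{(u - l)^{n}}, \qquad x \in [l, u].
$$

In a learnable activation of this form, the $n + 1$ coefficients $c_k$ would be the trainable parameters. The basis functions are nonnegative on $[l, u]$ and sum to one, so $\sigma(x)$ always lies between $\min_k c_k$ and $\max_k c_k$, with $\sigma(l) = c_0$ and $\sigma(u) = c_n$.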

Findings

01

DBNs achieve over 70% parameter reduction compared to ReLU-based networks across the majority of architectures, reaching 99.9% at scale.

02

DBNs converge faster, reaching ReLU's loss in as few as 26% of training epochs.

03

DBNs attain up to 45% lower final loss than traditional activation functions.

Abstract

The choice of activation function fundamentally shapes the representational capacity and parameter efficiency of deep neural networks, yet most widely used activations lack rigorous theoretical guarantees on these properties. We provide a theoretical analysis of DeepBern-Nets (DBNs) -- networks employing learnable Bernstein polynomial activations -- showing that their approximation error decays with the network depth $L$ and the polynomial order $n$ with a rate of $O(n^{-L})$, exponentially faster than the polynomial rate of ReLU architectures while remaining fully differentiable. We validate these predictions through 1,344 experiments on large scientific datasets (HIGGS and SUSY), comparing DBNs against ReLU, Leaky ReLU, SELU, and GeLU. DBNs achieve over $70\%$ parameter reduction across the majority of architectures -- reaching $99.9\%$ at scale -- converge to ReLU's…
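
As a rough illustration of what a Bernstein polynomial activation computes, the sketch below evaluates a degree-$n$ Bernstein polynomial and its derivative for a scalar input. It is not the authors' implementation: the function names, the `lo`/`hi` interval arguments, and the assumption that every input already lies in `[lo, hi]` are all hypothetical choices made for this example.

```python
from math import comb


def bernstein_activation(x, coeffs, lo=0.0, hi=1.0):
    """Evaluate sum_k coeffs[k] * b_{k,n}(x) on the interval [lo, hi].

    Assumption for this sketch: x lies in [lo, hi]. The coefficients
    play the role of the learnable parameters of the activation.
    """
    n = len(coeffs) - 1
    t = (x - lo) / (hi - lo)  # rescale x to the unit interval
    return sum(
        c * comb(n, k) * t**k * (1.0 - t) ** (n - k)
        for k, c in enumerate(coeffs)
    )


def bernstein_derivative(x, coeffs, lo=0.0, hi=1.0):
    """Derivative with respect to x, using the standard identity:
    the derivative of a degree-n Bernstein polynomial is n times a
    degree-(n-1) Bernstein polynomial built from coefficient differences.
    """
    n = len(coeffs) - 1
    if n == 0:
        return 0.0  # a constant activation has zero slope
    diffs = [coeffs[k + 1] - coeffs[k] for k in range(n)]
    # The chain rule for the rescaling contributes the 1 / (hi - lo) factor.
    return n * bernstein_activation(x, diffs, lo, hi) / (hi - lo)


if __name__ == "__main__":
    # Hypothetical coefficients for a degree-3 activation on [-1, 1].
    coeffs = [0.0, 0.2, 0.9, 1.0]
    for x in (-1.0, 0.0, 1.0):
        value = bernstein_activation(x, coeffs, lo=-1.0, hi=1.0)
        slope = bernstein_derivative(x, coeffs, lo=-1.0, hi=1.0)
        print(x, value, slope)
```

Because the result is a polynomial in `x`, it is smooth everywhere on the interval, which matches the abstract's remark that these activations remain fully differentiable. Each activation of degree $n$ carries $n + 1$ coefficients; how those are shared across neurons or layers is not specified on this page.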
