Robust Deep Network Learning of Nonlinear Regression Tasks by Parametric Leaky Exponential Linear Units (LELUs) and a Diffusion Metric

Enda D.V. Bigarella

arXiv:2507.06765 · cs.LG · October 3, 2025

TL;DR

This paper introduces a new parametric activation function called Leaky Exponential Linear Unit (LELU) that enhances nonlinear regression in deep networks, along with a diffusion-loss metric to evaluate overfitting.

Contribution

It proposes a novel smooth activation function (LELU) with non-zero gradients for better nonlinear regression and introduces a diffusion-loss metric to assess model overfitting.

Findings

1. LELU improves regression performance over traditional activations.

2. The diffusion-loss metric effectively gauges overfitting.

3. Smooth, trainable activations enhance neural network robustness.

Abstract

This document proposes a parametric activation function (ac.f.) aimed at improving multidimensional nonlinear data regression. It is established knowledge that nonlinear ac.f.'s are required for learning nonlinear datasets. This work shows that the smoothness and gradient properties of the ac.f. further impact the performance of large neural networks in terms of overfitting and sensitivity to model parameters. Smooth but vanishing-gradient ac.f.'s such as ELU or SiLU (Swish) have limited performance, and non-smooth ac.f.'s such as ReLU and Leaky ReLU further impart discontinuity in the trained model. Improved performance is demonstrated with a smooth "Leaky Exponential Linear Unit" with a non-zero gradient that can be trained. A novel diffusion-loss metric is also proposed to gauge the performance of the trained models in terms of overfitting.
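The smoothness and gradient properties contrasted in the abstract can be checked numerically. The sketch below uses the standard definitions of ReLU, Leaky ReLU, ELU and SiLU. The function `leaky_elu_sketch` and its parameter `beta` are assumptions made here for illustration only: this summary does not state the paper's LELU formula, so the sketch simply blends the ELU negative branch with a linear leak so that the slope at zero is continuous and the slope for very negative inputs tends to `beta` rather than zero. In a network, `beta` would presumably be a trainable parameter.

```python
import math

def relu(x):
    return max(0.0, x)

def leaky_relu(x, slope=0.01):
    return x if x > 0 else slope * x

def elu(x, alpha=1.0):
    return x if x > 0 else alpha * math.expm1(x)

def silu(x):
    return x / (1.0 + math.exp(-x))

def leaky_elu_sketch(x, beta=0.1):
    # ASSUMPTION: illustrative form only, not the paper's definition.
    # Negative branch = (1 - beta) * (exp(x) - 1) + beta * x, so the slope
    # is 1 at x = 0 (matching the positive branch) and tends to beta as
    # x goes to minus infinity.
    return x if x > 0 else (1.0 - beta) * math.expm1(x) + beta * x

def grad(f, x, h=1e-6):
    # Central finite difference, adequate for a qualitative comparison.
    return (f(x + h) - f(x - h)) / (2.0 * h)

def one_sided_slopes(f, h=1e-6):
    # Left and right slopes at x = 0; a mismatch indicates a kink.
    left = (f(0.0) - f(-h)) / h
    right = (f(h) - f(0.0)) / h
    return left, right

if __name__ == "__main__":
    activations = [relu, leaky_relu, elu, silu, leaky_elu_sketch]
    for f in activations:
        left, right = one_sided_slopes(f)
        print(f"{f.__name__:18s} grad(-20)={grad(f, -20.0):.3e} "
              f"slope(0-)={left:.4f} slope(0+)={right:.4f}")
```

Under these assumptions, ELU and SiLU are smooth at zero but their gradients decay toward zero for very negative inputs, ReLU and Leaky ReLU show mismatched one-sided slopes at zero, and the sketched leaky exponential form is smooth at zero while keeping a gradient close to `beta` far into the negative range.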
