The effect of Leaky ReLUs on the training and generalization of overparameterized networks

Yinglong Guo; Shaohan Li; Gilad Lerman

arXiv:2402.11942·cs.LG·February 27, 2024·3 cites

TL;DR

This paper analyzes how the Leaky ReLU parameter α influences the training and generalization errors of overparameterized neural networks. It derives upper bounds that depend on α and shows that α = -1 is optimal for the training error bound.

Contribution

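It provides upper bounds on the convergence rate of the training error and on the generalization error of overparameterized networks with leaky ReLUs, and shows that the absolute value activation (α = -1) is optimal for the training error bound and, in special settings, for the generalization error bound.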

Findings

01

α = -1 (absolute value) is optimal for the training error bound and, in special settings, for the generalization error bound

02

Numerical experiments support the theoretical optimality of α = -1

03

The bounds depend explicitly on the Leaky ReLU parameter α, which guides its choice in practice

Abstract

We investigate the training and generalization errors of overparameterized neural networks (NNs) with a wide class of leaky rectified linear unit (ReLU) functions. More specifically, we carefully upper bound both the convergence rate of the training error and the generalization error of such NNs and investigate the dependence of these bounds on the Leaky ReLU parameter, $\alpha$. We show that $\alpha = -1$, which corresponds to the absolute value activation function, is optimal for the training error bound. Furthermore, in special settings, it is also optimal for the generalization error bound. Numerical experiments empirically support the practical choices guided by the theory.
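
To make the role of the parameter concrete, the following is a minimal sketch of the leaky ReLU family, assuming the common convention that the activation is the identity for non-negative inputs and has slope α for negative inputs (the paper's exact definition may differ in details). The function name `leaky_relu` and the sample inputs are illustrative only. Under this convention, α = 0 gives the standard ReLU, α = 1 gives the identity, and α = -1 gives the absolute value.

```python
def leaky_relu(x: float, alpha: float) -> float:
    """Leaky ReLU with slope `alpha` on the negative side (assumed convention)."""
    return x if x >= 0 else alpha * x


# Hypothetical sample inputs, chosen only to illustrate the special cases.
xs = [-2.0, -0.5, 0.0, 1.5]

relu_vals = [leaky_relu(x, 0.0) for x in xs]      # alpha = 0: standard ReLU
abs_vals = [leaky_relu(x, -1.0) for x in xs]      # alpha = -1: absolute value
identity_vals = [leaky_relu(x, 1.0) for x in xs]  # alpha = 1: identity (linear)

for x, a in zip(xs, abs_vals):
    print(f"x = {x:5.2f}  leaky_relu(x, -1) = {a:4.2f}  |x| = {abs(x):4.2f}")
```

With α = -1 the negative half-line is reflected rather than damped, so the activation coincides with |x| on every input.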

Taxonomy

Topics: Face and Expression Recognition · Neural Networks and Applications · Fuzzy Logic and Control Systems
