The effect of Leaky ReLUs on the training and generalization of overparameterized networks
Yinglong Guo, Shaohan Li, Gilad Lerman

TL;DR
This paper analyzes how leaky ReLU activation functions influence training and generalization errors in overparameterized neural networks, providing bounds that depend on the Leaky ReLU parameter α and identifying α = -1 (the absolute value activation) as optimal for the training error bound.
Contribution
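It offers theoretical upper bounds on the training error convergence rate and the generalization error for overparameterized networks with leaky ReLUs, and shows that the absolute value activation is optimal for the training error bound and, in special settings, for the generalization error bound.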
Findings
α = -1 (absolute value) yields optimal training error bounds
Numerical experiments support the theoretical optimality of α = -1
Bounds depend on Leaky ReLU parameter, guiding practical choices
Abstract
We investigate the training and generalization errors of overparameterized neural networks (NNs) with a wide class of leaky rectified linear unit (ReLU) functions. More specifically, we carefully upper bound both the convergence rate of the training error and the generalization error of such NNs and investigate the dependence of these bounds on the Leaky ReLU parameter, α. We show that α = -1, which corresponds to the absolute value activation function, is optimal for the training error bound. Furthermore, in special settings, it is also optimal for the generalization error bound. Numerical experiments empirically support the practical choices guided by the theory.
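To make the role of the Leaky ReLU parameter concrete, here is a minimal sketch of the activation family. It assumes the common parametrization in which the function is the identity for non-negative inputs and has slope α for negative inputs; the abstract does not spell out the definition, so the function name `leaky_relu` and this convention are illustrative assumptions. Under that convention, α = -1 gives the absolute value, α = 0 gives the standard ReLU, and α = 1 gives the identity.

```python
def leaky_relu(x: float, alpha: float) -> float:
    """Leaky ReLU under the assumed convention:
    identity for x >= 0, slope `alpha` for x < 0."""
    return x if x >= 0 else alpha * x


# Sample inputs on both sides of zero.
xs = [-2.0, -0.5, 0.0, 0.5, 2.0]

# alpha = -1 flips the sign of negative inputs, i.e. the absolute value.
abs_like = [leaky_relu(x, -1.0) for x in xs]

# alpha = 0 zeroes out negative inputs, i.e. the standard ReLU.
relu_like = [leaky_relu(x, 0.0) for x in xs]

# alpha = 1 leaves every input unchanged, i.e. the identity map.
identity_like = [leaky_relu(x, 1.0) for x in xs]
```

The sketch only illustrates the activation family itself; it does not reproduce the paper's bounds or experiments.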
Taxonomy
Topics: Face and Expression Recognition · Neural Networks and Applications · Fuzzy Logic and Control Systems
