Data-aware customization of activation functions reduces neural network error

Fuchang Gao; Boyu Zhang

arXiv:2301.06635·cs.LG·January 18, 2023·5 cites

TL;DR

Customizing activation functions based on data characteristics can significantly reduce neural network error, with the paper proposing criteria, a new 'seagull' activation function, and demonstrating substantial improvements across various tasks.

Contribution

The paper introduces a data-aware approach to customizing activation functions, including a new 'seagull' function, and provides theoretical criteria and empirical evidence of error reduction.

Findings

01. Order-of-magnitude error reduction with the 'seagull' activation.

02. Best results when applied to exchangeability-connected layers.

03. Effective on both low- and high-dimensional datasets.

Abstract

Activation functions play critical roles in neural networks, yet current off-the-shelf neural networks pay little attention to the specific choice of activation functions used. Here we show that data-aware customization of activation functions can result in striking reductions in neural network error. We first give a simple linear algebraic explanation of the role of activation functions in neural networks; then, through connection with the Diaconis-Shahshahani Approximation Theorem, we propose a set of criteria for good activation functions. As a case study, we consider regression tasks with a partially exchangeable target function, i.e. $f(u, v, w) = f(v, u, w)$ for $u, v \in \mathbb{R}^{d}$ and $w \in \mathbb{R}^{k}$, and prove that for such a target function, using an even activation function in at least one of the layers guarantees that the prediction preserves partial exchangeability…
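
The partial-exchangeability idea in the abstract can be illustrated with a small numerical sketch. This is not the authors' architecture: the layer layout, the helper names (`even_act`, `make_params`, `predict`), the random weights, and the choice of log(1 + x^2) as an even function are all assumptions made for illustration. The sketch relies only on the fact that an even function g satisfies g(-x) = g(x), so g(a·(u − v)) is unchanged when u and v are swapped.

```python
import math
import random


def even_act(x):
    # Illustrative even activation (an assumption): log(1 + x^2),
    # which satisfies even_act(-x) == even_act(x).
    return math.log1p(x * x)


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def make_params(d, k, hidden, seed=0):
    # Hypothetical random weights: each hidden unit holds (a, b, c, bias, coef).
    rng = random.Random(seed)

    def vec(n):
        return [rng.gauss(0.0, 1.0) for _ in range(n)]

    return [(vec(d), vec(d), vec(k), rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
            for _ in range(hidden)]


def predict(u, v, w, params, act=even_act):
    diff = [ui - vi for ui, vi in zip(u, v)]   # flips sign when u and v are swapped
    total = [ui + vi for ui, vi in zip(u, v)]  # unchanged when u and v are swapped
    out = 0.0
    for a, b, c, bias, coef in params:
        swap_sensitive = act(dot(a, diff))     # an even act removes the sign flip
        swap_invariant = math.tanh(dot(b, total) + dot(c, w) + bias)
        out += coef * (swap_sensitive + swap_invariant)
    return out


rng = random.Random(1)
u = [rng.gauss(0.0, 1.0) for _ in range(3)]
v = [rng.gauss(0.0, 1.0) for _ in range(3)]
w = [rng.gauss(0.0, 1.0) for _ in range(2)]
params = make_params(d=3, k=2, hidden=4)

# With the even activation, swapping u and v leaves the prediction unchanged.
print(predict(u, v, w, params), predict(v, u, w, params))
# With an odd activation such as tanh, the two predictions generally differ.
print(predict(u, v, w, params, act=math.tanh),
      predict(v, u, w, params, act=math.tanh))
```

In this toy model the symmetry holds for any weights, because the only swap-sensitive quantity, a·(u − v), passes through the even function before it reaches the output.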

Taxonomy

Topics: Machine Learning in Materials Science · Adversarial Robustness in Machine Learning · Advanced Neural Network Applications