# Static Activation Function Normalization

**Authors:** Pierre H. Richemond, Yike Guo

arXiv: 1905.01369 · 2019-05-07

## TL;DR

This paper introduces static activation normalization, a transform applied to an existing activation function such as ReLU. Motivated by results from random matrix theory, it aims to improve convergence robustness, maximum trainable depth, and anytime performance at no additional computational cost.

## Contribution

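The paper proposes a process that converts an existing activation function into a normalized variant with better properties. It builds on random matrix theory results that link convergence speed and robustness to the combination of random weight initialization and nonlinearity. The study focuses on the normalized ReLU and checks its claims against empirical eigenvalue distributions of trained networks.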

## Key findings

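- Normalizing ReLU significantly improves convergence robustness and maximum training depth in the authors' experiments.
- It also improves the anytime performance of the trained networks.
- The authors present the method as a first step towards benefits similar in spirit to batch normalization, without the computational cost.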

## Abstract

Recent seminal work at the intersection of deep neural networks practice and random matrix theory has linked the convergence speed and robustness of these networks with the combination of random weight initialization and nonlinear activation function in use. Building on those principles, we introduce a process to transform an existing activation function into another one with better properties. We term such transform *static activation normalization*. More specifically we focus on this normalization applied to the ReLU unit, and show empirically that it significantly promotes convergence robustness, maximum training depth, and anytime performance. We verify these claims by examining empirical eigenvalue distributions of networks trained with those activations. Our static activation normalization provides a first step towards giving benefits similar in spirit to schemes like batch normalization, but without computational cost.
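
The abstract does not spell out the transform itself, so the following is only a minimal sketch of one plausible reading: shift and rescale ReLU by fixed constants so that its output has zero mean and unit variance when the input is a standard Gaussian. That normalization target, and the names `normalized_relu` and `gaussian_moments`, are assumptions made for illustration and are not taken from the paper. The constants follow from the closed-form Gaussian moments of ReLU, $\mathbb{E}[\max(z,0)] = 1/\sqrt{2\pi}$ and $\mathbb{E}[\max(z,0)^2] = 1/2$.

```python
import math
import random

# Closed-form moments of ReLU for z ~ N(0, 1).
RELU_MEAN = 1.0 / math.sqrt(2.0 * math.pi)         # E[max(z, 0)]
RELU_STD = math.sqrt(0.5 - 1.0 / (2.0 * math.pi))  # sqrt(E[max(z, 0)^2] - mean^2)


def relu(x: float) -> float:
    return max(x, 0.0)


def normalized_relu(x: float) -> float:
    """ReLU shifted and rescaled by fixed constants.

    Assumption: "normalization" here means zero mean and unit variance
    under standard Gaussian inputs; the paper's exact transform may differ.
    """
    return (relu(x) - RELU_MEAN) / RELU_STD


def gaussian_moments(f, n: int = 100_000, seed: int = 0):
    """Monte Carlo estimate of the mean and variance of f(z), z ~ N(0, 1)."""
    rng = random.Random(seed)
    samples = [f(rng.gauss(0.0, 1.0)) for _ in range(n)]
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / n
    return mean, var


if __name__ == "__main__":
    # Plain ReLU has a positive mean and variance below 1;
    # the normalized variant should come out close to 0 and 1.
    print("relu:           ", gaussian_moments(relu))
    print("normalized_relu:", gaussian_moments(normalized_relu))
```

Because the two constants are computed once and do not depend on the data, the normalized function costs the same as ReLU plus one subtraction and one division at inference and training time, which is consistent with the "static" framing in the abstract. No per-batch statistics are involved.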

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1905.01369/full.md

## Figures

18 figures with captions in the complete paper: https://tomesphere.com/paper/1905.01369/full.md

## References

23 references — full list in the complete paper: https://tomesphere.com/paper/1905.01369/full.md

---
Source: https://tomesphere.com/paper/1905.01369