Large Deviations of Gaussian Neural Networks with ReLU activation
Quirin Vogel

TL;DR
This paper establishes a large deviation principle for deep Gaussian neural networks with at most linearly growing activation functions such as ReLU, extending earlier results that covered only bounded and continuous activations. It also gives simplified expressions for the rate function and power-series expansions for the ReLU case.
Contribution
It generalizes large deviation results from bounded, continuous activations to at most linearly growing activations such as ReLU, and offers simplified expressions for the rate function together with power-series expansions for the ReLU case.
Findings
Large deviation principle proven for ReLU-activated Gaussian neural networks
Simplified expressions for the rate function are provided
Power-series expansions for the ReLU case are developed
Abstract
We prove a large deviation principle for deep neural networks with Gaussian weights and at most linearly growing activation functions, such as ReLU. This generalises earlier work, in which bounded and continuous activation functions were considered. In practice, linearly growing activation functions such as ReLU are most commonly used. We furthermore simplify previous expressions for the rate function and provide power-series expansions for the ReLU case.
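For readers unfamiliar with the term, the standard textbook definition of a large deviation principle can be written as follows. This is the generic definition only: the symbols $\mu_n$, $\mathcal{X}$, $a_n$ and $I$ are placeholders introduced here for illustration, and the abstract does not specify which network quantity, speed, or rate function the paper uses.

```latex
% Generic definition (assumed notation, not taken from the paper):
% a sequence of probability measures (\mu_n) on a topological space \mathcal{X}
% satisfies a large deviation principle with speed a_n \to \infty and
% lower semicontinuous rate function I : \mathcal{X} \to [0, \infty]
% if, for every Borel set A \subseteq \mathcal{X},
\[
  -\inf_{x \in A^{\circ}} I(x)
  \;\le\; \liminf_{n \to \infty} \frac{1}{a_n} \log \mu_n(A)
  \;\le\; \limsup_{n \to \infty} \frac{1}{a_n} \log \mu_n(A)
  \;\le\; -\inf_{x \in \overline{A}} I(x),
\]
% where A^{\circ} is the interior and \overline{A} the closure of A.
```

Informally, the probability of an atypical event $A$ decays like $\exp(-a_n \inf_{A} I)$, so the rate function $I$ quantifies how unlikely each outcome is on an exponential scale.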
Taxonomy
Topics: Neural Networks and Applications
