Noisy Activation Functions
Caglar Gulcehre, Marcin Moczulski, Misha Denil, Yoshua Bengio

TL;DR
This paper introduces noisy activation functions that inject noise into the saturated regions of neural network activations, so that gradients can still flow there. This improves training, especially on tasks hard enough that curriculum learning would otherwise be needed.
Contribution
It proposes a novel method of adding targeted noise to activation functions to enhance gradient-based optimization in neural networks.
Findings
Noisy activations improve training stability and convergence.
Replacing saturating activations with noisy variants yields state-of-the-art or competitive results on several datasets and tasks.
Noise injection helps in training difficult models with saturation issues.
Abstract
Common nonlinear activation functions used in neural networks can cause training difficulties due to the saturation behavior of the activation function, which may hide dependencies that are not visible to vanilla-SGD (using first order gradients only). Gating mechanisms that use softly saturating activation functions to emulate the discrete switching of digital logic circuits are good examples of this. We propose to exploit the injection of appropriate noise so that the gradients may flow easily, even if the noiseless application of the activation function would yield zero gradient. Large noise will dominate the noise-free gradient and allow stochastic gradient descent to explore more. By adding noise only to the problematic parts of the activation function, we allow the optimization procedure to explore the boundary between the degenerate (saturating) and the well-behaved parts of the…
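To make the idea in the abstract concrete, here is a minimal sketch of noise applied only where an activation saturates. It is a simplified illustration, not the paper's exact formulation: the function names, the `scale` parameter, the choice of hard-tanh, and the use of half-normal noise scaled by the distance past the saturation threshold are all assumptions made for this example.

```python
import numpy as np


def hard_tanh(x):
    """Piecewise-linear tanh: linear on [-1, 1], saturated outside."""
    return np.clip(x, -1.0, 1.0)


def noisy_hard_tanh(x, rng, scale=1.0, training=True):
    """Illustrative noisy hard-tanh (hypothetical helper, not the paper's API).

    Noise is added only where the unit is saturated, and its magnitude
    grows with how far the pre-activation x lies beyond the threshold.
    """
    h = hard_tanh(x)
    if not training:
        # Assumption for this sketch: no noise at evaluation time.
        return h
    # Distance past the saturation threshold; exactly zero in the linear region.
    delta = np.abs(x - h)
    # Half-normal noise, so the perturbation always points back toward
    # the linear region (opposite in sign to x).
    noise = np.abs(rng.standard_normal(x.shape))
    return h - np.sign(x) * scale * delta * noise


rng = np.random.default_rng(0)
x = np.array([-3.0, -0.5, 0.0, 0.5, 3.0])
y = noisy_hard_tanh(x, rng)
```

In this sketch the inputs in the linear region pass through unchanged, while the two saturated inputs are perturbed. Because the noise term depends on `x` through `delta`, the output has a nonzero derivative with respect to `x` even where plain hard-tanh has zero gradient, which is the effect the abstract describes.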
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Machine Learning and Algorithms · Stochastic Gradient Optimization Techniques
