Filtered Batch Normalization
Andras Horvath, Jalal Al-afandi

TL;DR
This paper challenges the assumption that neural network activations are Gaussian and proposes filtering out extreme activations before computing batch normalization statistics, which the authors report leads to faster convergence and higher validation accuracy.
Contribution
It introduces a filtering approach to stabilize batch normalization by removing out-of-distribution activations, enhancing training efficiency and model performance.
Findings
Filtering improves mean and variance stability during training
Enhanced batch normalization accelerates convergence
Higher validation accuracy achieved with filtering
Abstract
It is a common assumption that the activations of different layers in neural networks follow a Gaussian distribution. This distribution can be transformed using normalization techniques, such as batch normalization, increasing convergence speed and improving accuracy. In this paper we demonstrate that activations do not necessarily follow a Gaussian distribution in all layers. Neurons in deeper layers are more selective and specific, which can result in extremely large, out-of-distribution activations. We demonstrate that filtering out these activations yields more consistent mean and variance values for batch normalization during training, which can further improve convergence speed and yield higher validation accuracy.
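The abstract does not state the exact filtering rule, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a simple criterion: activations farther than k standard deviations from the unfiltered batch mean are excluded when computing the normalization statistics, while every activation (including the excluded ones) is still normalized. The function names `filtered_stats` and `filtered_batch_norm`, the threshold `k`, and the constant `eps` are hypothetical choices made for illustration, and the sketch handles a single feature with no learnable scale or shift.

```python
import math
from statistics import fmean


def filtered_stats(values, k=3.0):
    """Return (mean, variance) of one feature over a batch, ignoring outliers.

    ASSUMPTION: an activation counts as an outlier when it lies more than
    k standard deviations from the unfiltered batch mean. The paper's actual
    criterion may differ.
    """
    mean = fmean(values)
    var = fmean([(v - mean) ** 2 for v in values])
    std = math.sqrt(var)

    # Keep only activations inside the k-sigma band around the raw mean.
    kept = [v for v in values if abs(v - mean) <= k * std]
    if not kept:  # defensive fallback so the statistics are always defined
        kept = list(values)

    f_mean = fmean(kept)
    f_var = fmean([(v - f_mean) ** 2 for v in kept])
    return f_mean, f_var


def filtered_batch_norm(values, k=3.0, eps=1e-5):
    """Normalize every activation using the filtered mean and variance."""
    mean, var = filtered_stats(values, k)
    scale = math.sqrt(var + eps)
    return [(v - mean) / scale for v in values]


# Toy batch: twenty ordinary activations plus one extreme value.
batch = [1.0] * 10 + [3.0] * 10 + [100.0]
stats = filtered_stats(batch)
normalized = filtered_batch_norm(batch)
```

In this toy batch the single extreme value dominates the unfiltered mean and variance, whereas the filtered statistics reflect only the twenty ordinary activations. Whether the excluded activations should also be clipped or dropped from the forward pass is a separate design decision that this sketch leaves open.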
Taxonomy
Methods: Batch Normalization
