Filtered Batch Normalization

Andras Horvath; Jalal Al-afandi

arXiv:2010.08251·cs.LG·October 19, 2020

TL;DR

This paper challenges the assumption that neural network activations are Gaussian in every layer and proposes filtering out-of-distribution activations when computing batch normalization statistics, which the authors report leads to faster convergence and higher validation accuracy.

Contribution

The paper introduces a filtering approach that stabilizes batch normalization by excluding out-of-distribution activations from the batch mean and variance, improving training efficiency and model performance.

Findings

01 Filtering improves mean and variance stability during training

02 Enhanced batch normalization accelerates convergence

03 Higher validation accuracy achieved with filtering

Abstract

It is commonly assumed that the activations of different layers in neural networks follow a Gaussian distribution. This distribution can be transformed using normalization techniques such as batch normalization, increasing convergence speed and improving accuracy. In this paper we demonstrate that activations do not necessarily follow a Gaussian distribution in all layers. Neurons in deeper layers are more selective and specific, which can result in extremely large, out-of-distribution activations. We demonstrate that filtering out these activations yields more consistent mean and variance values for batch normalization during training, which can further improve convergence speed and yield higher validation accuracy.
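The idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not state the filtering criterion, so the rule used here (drop activations more than `k` standard deviations from the plain batch mean, with `k = 3.0`) and the function names `filtered_batch_stats` and `filtered_batch_norm` are assumptions made for illustration only.

```python
import numpy as np


def filtered_batch_stats(x, k=3.0):
    """Per-channel mean and variance of x (shape: batch x channels),
    computed only from activations within k standard deviations of the
    plain batch mean. The threshold rule is an assumption, not taken
    from the paper."""
    mean = x.mean(axis=0)
    std = x.std(axis=0)
    # Keep activations close to the unfiltered mean; large outliers are masked out.
    mask = np.abs(x - mean) <= k * std
    count = np.maximum(mask.sum(axis=0), 1)  # guard against an empty selection
    f_mean = (x * mask).sum(axis=0) / count
    f_var = (((x - f_mean) ** 2) * mask).sum(axis=0) / count
    return f_mean, f_var


def filtered_batch_norm(x, gamma, beta, k=3.0, eps=1e-5):
    """Normalize every activation (including the filtered-out ones)
    with the filtered statistics, then apply the usual scale and shift."""
    f_mean, f_var = filtered_batch_stats(x, k)
    return gamma * (x - f_mean) / np.sqrt(f_var + eps) + beta


# Usage: one extreme activation distorts the plain statistics
# far more than the filtered ones.
rng = np.random.default_rng(0)
x = rng.normal(size=(256, 4))
x[0, 0] = 1000.0  # a single out-of-distribution activation in channel 0
plain_mean = x.mean(axis=0)
f_mean, f_var = filtered_batch_stats(x)
y = filtered_batch_norm(x, gamma=np.ones(4), beta=np.zeros(4))
```

In this toy batch the single outlier pulls the plain mean of channel 0 well away from zero, while the filtered mean stays close to that of the remaining activations, which is the kind of more consistent statistic the abstract refers to.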

Taxonomy

Methods: Batch Normalization