Mode Normalization
Lucas Deecke, Iain Murray, Hakan Bilen

TL;DR
This paper introduces Mode Normalization, a flexible normalization technique that detects data modes on-the-fly, improving training stability and performance over traditional batch normalization, especially with multi-modal data distributions.
Contribution
The paper proposes Mode Normalization, which extends normalization from a single mean and variance to several. Modes are detected during training, and samples that share common features are normalized jointly.
Findings
Outperforms batch normalization and other widely used normalization techniques in several experiments
Effective on single and multi-task datasets
Enhances training stability and performance
Abstract
Normalization methods are a central building block in the deep learning toolbox. They accelerate and stabilize training, while decreasing the dependence on manually tuned learning rate schedules. When learning from multi-modal distributions, the effectiveness of batch normalization (BN), arguably the most prominent normalization method, is reduced. As a remedy, we propose a more flexible approach: by extending the normalization to more than a single mean and variance, we detect modes of data on-the-fly, jointly normalizing samples that share common features. We demonstrate that our method outperforms BN and other widely used normalization techniques in several experiments, including single and multi-task datasets.
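The idea of normalizing with more than one mean and variance can be illustrated with a minimal NumPy sketch. The abstract does not say how samples are assigned to modes, so the sketch assumes a soft assignment: a hypothetical linear gate (`gate_weights`) followed by a softmax. The function name `mode_normalize`, the gate, and the omission of learned scale and shift parameters and of running statistics are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def softmax(z, axis=-1):
    # Subtract the row maximum for numerical stability.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def mode_normalize(x, gate_weights, eps=1e-5):
    """Normalize x with K sets of statistics instead of one.

    x            : (N, D) batch of features
    gate_weights : (D, K) hypothetical linear gate (an assumption of this sketch)
    returns      : (N, D) normalized features
    """
    # Soft assignment of every sample to each of the K modes.
    g = softmax(x @ gate_weights, axis=1)            # (N, K)
    n_k = g.sum(axis=0)                              # (K,) effective mode sizes

    # Assignment-weighted mean and variance per mode and feature.
    mu = (g.T @ x) / n_k[:, None]                    # (K, D)
    diff = x[:, None, :] - mu[None, :, :]            # (N, K, D)
    var = (g[:, :, None] * diff ** 2).sum(axis=0) / n_k[:, None]  # (K, D)

    # Normalize under every mode, then mix by the assignments.
    x_hat = diff / np.sqrt(var[None, :, :] + eps)    # (N, K, D)
    return (g[:, :, None] * x_hat).sum(axis=1)       # (N, D)


# Usage: two well-separated clusters and a gate that tells them apart.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-5.0, 1.0, (50, 2)),
                    rng.normal(5.0, 1.0, (50, 2))])
gate = np.array([[10.0, -10.0],
                 [10.0, -10.0]])
y = mode_normalize(x, gate)
```

With a single mode (K = 1) every assignment equals one, and the sketch reduces to standardizing each feature with the batch mean and variance, which is the core step of batch normalization without its learned affine parameters.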
Taxonomy
Topics: Machine Learning and Data Classification · Anomaly Detection Techniques and Applications · Domain Adaptation and Few-Shot Learning
Methods: Mode Normalization · Batch Normalization
