Sandwich Batch Normalization: A Drop-In Replacement for Feature Distribution Heterogeneity
Xinyu Gong, Wuyang Chen, Tianlong Chen, Zhangyang Wang

TL;DR
Sandwich Batch Normalization (SaBN) is a simple yet effective modification of BN that addresses feature distribution heterogeneity, improving performance across tasks like image generation, NAS, adversarial training, and style transfer.
Contribution
SaBN introduces a novel affine layer structure that promotes balanced gradients and can be easily integrated into existing models, enhancing their performance.
Findings
Improves Inception Score and FID on CIFAR-10 and ImageNet GANs
Boosts NAS-Bench-201 performance significantly
Enhances adversarial robustness and style transfer quality
Abstract
We present Sandwich Batch Normalization (SaBN), a frustratingly easy improvement of Batch Normalization (BN) with only a few lines of code changes. SaBN is motivated by addressing the inherent feature distribution heterogeneity that can be identified in many tasks, which can arise from data heterogeneity (multiple input domains) or model heterogeneity (dynamic architectures, model conditioning, etc.). Our SaBN factorizes the BN affine layer into one shared sandwich affine layer, cascaded by several parallel independent affine layers. Concrete analysis reveals that, during optimization, SaBN promotes balanced gradient norms while still preserving diverse gradient directions -- a property that many application tasks seem to favor. We demonstrate the prevailing effectiveness of SaBN as a drop-in replacement in four tasks: conditional image generation, neural architecture search (NAS),…
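The factorization described in the abstract can be illustrated with a minimal forward-pass sketch in NumPy. It assumes 2-D features of shape (N, C), uses batch statistics only (no running averages, no backward pass), and selects one independent affine branch per sample by an integer index. The class name `SandwichBatchNorm` and the parameter names `gamma_sa`, `beta_sa`, `gamma`, and `beta` are hypothetical choices for illustration and are not taken from the authors' code.

```python
import numpy as np


class SandwichBatchNorm:
    """Forward-pass sketch of SaBN on (N, C) features, batch statistics only."""

    def __init__(self, num_features, num_branches, eps=1e-5):
        self.eps = eps
        # Shared "sandwich" affine parameters, applied to every sample.
        self.gamma_sa = np.ones(num_features)
        self.beta_sa = np.zeros(num_features)
        # One independent affine pair per branch (e.g. per class or domain).
        self.gamma = np.ones((num_branches, num_features))
        self.beta = np.zeros((num_branches, num_features))

    def __call__(self, x, branch):
        # x: float array (N, C); branch: int array (N,) of branch indices.
        mean = x.mean(axis=0)
        var = x.var(axis=0)
        x_hat = (x - mean) / np.sqrt(var + self.eps)
        # Shared affine first, then the per-sample independent affine.
        h = self.gamma_sa * x_hat + self.beta_sa
        return self.gamma[branch] * h + self.beta[branch]


# Usage: 8 samples, 4 channels, 3 branches.
rng = np.random.default_rng(0)
x = rng.normal(size=(8, 4))
branch = np.array([0, 1, 2, 0, 1, 2, 0, 1])
sabn = SandwichBatchNorm(num_features=4, num_branches=3)
y = sabn(x, branch)
```

With every scale set to one and every shift set to zero, the layer reduces to plain normalization. In a trainable version the shared pair would receive gradients from every branch, while each independent pair would only see the samples routed to it.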
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advanced Neural Network Applications · Anomaly Detection Techniques and Applications
Methods: Sandwich Batch Normalization · Batch Normalization
