Evaluating Prediction-Time Batch Normalization for Robustness under Covariate Shift
Zachary Nado, Shreyas Padhy, D. Sculley, Alexander D'Amour, Balaji Lakshminarayanan, Jasper Snoek

TL;DR
This paper introduces prediction-time batch normalization, a simple method that leverages small unlabeled data batches before prediction to significantly improve deep learning model robustness and calibration under covariate shift, achieving state-of-the-art results.
Contribution
The paper proposes a novel, easy-to-implement prediction-time batch normalization technique that enhances model robustness under covariate shift without retraining.
Findings
Achieves state-of-the-art results on recent covariate shift benchmarks, including an mCE of 60.28% on ImageNet-C.
Significantly improves model accuracy and calibration under covariate shift.
Complementary to existing robustness methods like deep ensembles.
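Since the findings refer to calibration, it may help to see one common way of quantifying it. The sketch below computes the expected calibration error (ECE) with equal-width confidence bins. The function name `expected_calibration_error`, the `num_bins` default, and the binning scheme are assumptions made for illustration; this summary does not state which calibration metric or binning the authors used.

```python
import numpy as np


def expected_calibration_error(confidences, correct, num_bins=10):
    """Illustrative ECE with equal-width bins (an assumed, common formulation).

    confidences: predicted probability of the chosen class, one per example.
    correct: 1 if the prediction was right, 0 otherwise, one per example.
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    bin_edges = np.linspace(0.0, 1.0, num_bins + 1)

    ece = 0.0
    for lower, upper in zip(bin_edges[:-1], bin_edges[1:]):
        in_bin = (confidences > lower) & (confidences <= upper)
        if in_bin.any():
            # Gap between empirical accuracy and mean confidence in this bin.
            gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
            # Weight the gap by the fraction of examples falling in the bin.
            ece += in_bin.mean() * gap
    return ece


# An overconfident toy model: 95% confidence but only 3 of 4 predictions right.
overconfident_ece = expected_calibration_error([0.95] * 4, [1, 1, 1, 0])
```

A lower value means the model's stated confidence tracks its actual accuracy more closely; a model that is 75% confident and right 75% of the time would score zero under this formulation.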
Abstract
Covariate shift has been shown to sharply degrade both predictive accuracy and the calibration of uncertainty estimates for deep learning models. This is worrying, because covariate shift is prevalent in a wide range of real-world deployment settings. However, in this paper, we note that frequently there exists the potential to access small unlabeled batches of the shifted data just before prediction time. This interesting observation enables a simple but surprisingly effective method which we call prediction-time batch normalization, which significantly improves model accuracy and calibration under covariate shift. Using this one line code change, we achieve state-of-the-art on recent covariate shift benchmarks and an mCE of 60.28% on the challenging ImageNet-C dataset; to our knowledge, this is the best result for any model that does not incorporate additional data augmentation or…
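The idea of normalizing with statistics from the unlabeled prediction batch, rather than with the running averages collected during training, can be sketched in a few lines of NumPy. This is a minimal illustration and not the authors' code: the function `batch_norm`, the `use_batch_stats` flag, and the toy data are all hypothetical names and values chosen for the example.

```python
import numpy as np


def batch_norm(x, gamma, beta, running_mean, running_var,
               use_batch_stats=False, eps=1e-5):
    """Normalize activations x of shape (batch, features).

    use_batch_stats=False: standard inference, using training-time averages.
    use_batch_stats=True: prediction-time variant, using statistics of x itself.
    (Hypothetical helper for illustration only.)
    """
    if use_batch_stats:
        # Recompute mean and variance from the unlabeled prediction batch.
        mean = x.mean(axis=0)
        var = x.var(axis=0)
    else:
        mean, var = running_mean, running_var
    return gamma * (x - mean) / np.sqrt(var + eps) + beta


rng = np.random.default_rng(0)
num_features = 4
gamma, beta = np.ones(num_features), np.zeros(num_features)
# Assumed training-time statistics: zero mean, unit variance.
running_mean, running_var = np.zeros(num_features), np.ones(num_features)

# A toy "shifted" batch whose mean and scale differ from training.
shifted_batch = rng.normal(loc=3.0, scale=2.0, size=(256, num_features))

standard_out = batch_norm(shifted_batch, gamma, beta,
                          running_mean, running_var)
prediction_time_out = batch_norm(shifted_batch, gamma, beta,
                                 running_mean, running_var,
                                 use_batch_stats=True)
```

In this toy setup, `standard_out` stays centered far from zero because the stale training statistics no longer match the inputs, while `prediction_time_out` is re-centered and re-scaled per feature. In a real framework the switch would likely be a single flag on the normalization layer, which is consistent with the abstract's "one line code change", though the exact flag depends on the library.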
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Adversarial Robustness in Machine Learning
Methods: Batch Normalization
