An Empirical Study on Distribution Shift Robustness From the Perspective of Pre-Training and Data Augmentation
Ziquan Liu, Yi Xu, Yuanhong Xu, Qi Qian, Hao Li, Rong Jin, Xiangyang Ji, Antoni B. Chan

TL;DR
This empirical study evaluates how pre-training and data augmentation affect model robustness under distribution shift, using seven pre-trained models, five datasets from the WILDS and DomainBed benchmarks, and five learning algorithms.
Contribution
It systematically investigates the impact of pre-training modes and data augmentation on distribution shift robustness across multiple datasets and models.
Findings
ERM (empirical risk minimization) combined with data augmentation achieves state-of-the-art results when paired with a suitable pre-trained model.
Specialized algorithms like GroupDRO and CORAL further enhance robustness for specific shifts.
Pre-training choices significantly affect model performance under distribution shifts.
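The findings contrast plain ERM with specialized algorithms such as GroupDRO, but this summary does not describe either objective. As a rough illustration only, the sketch below follows the commonly used formulation of GroupDRO (an exponentiated-gradient update on per-group weights) and compares it with a size-weighted ERM average. The function names, the step size, and the example numbers are assumptions made for illustration; they are not taken from the paper or its code.

```python
import math


def erm_loss(group_losses, group_sizes):
    """ERM objective: average loss over all samples, ignoring group identity."""
    total = sum(group_sizes)
    return sum(l * n for l, n in zip(group_losses, group_sizes)) / total


def group_dro_step(group_losses, weights, step_size=0.01):
    """One GroupDRO reweighting step (hypothetical helper).

    group_losses: mean loss of each group on the current batch.
    weights: current group weights (non-negative, summing to 1).
    Returns (robust_loss, new_weights).
    """
    # Exponentiated-gradient update: groups with higher loss gain weight.
    scaled = [w * math.exp(step_size * l) for w, l in zip(weights, group_losses)]
    norm = sum(scaled)
    new_weights = [s / norm for s in scaled]
    # The training objective is the weighted sum of the group losses.
    robust_loss = sum(w * l for w, l in zip(new_weights, group_losses))
    return robust_loss, new_weights


# Illustrative numbers: a large, easy group and a small, hard group.
losses = [0.2, 1.0]
sizes = [90, 10]
erm = erm_loss(losses, sizes)
robust, new_w = group_dro_step(losses, [0.5, 0.5], step_size=1.0)
```

In this toy setting the ERM average is dominated by the large group, while the GroupDRO step shifts weight toward the group with the higher loss, so the robust objective sits closer to the worst-group loss.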
Abstract
The performance of machine learning models under distribution shift has been a focus of the community in recent years. Most current methods aim to improve robustness to distribution shift from the algorithmic perspective, i.e., by designing better training algorithms that help generalization to shifted test distributions. This paper studies the distribution shift problem from the perspective of pre-training and data augmentation, two important factors in the practice of deep learning that have not been systematically investigated by existing work. By evaluating seven pre-trained models, including ResNets and ViTs trained in self-supervised and supervised modes, on five important distribution-shift datasets from the WILDS and DomainBed benchmarks, with five different learning algorithms, we provide the first comprehensive empirical study focusing on pre-training and data…
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Graph Neural Networks · Machine Learning and ELM
Methods: Correlation Alignment for Deep Domain Adaptation
