Spurious Feature Diversification Improves Out-of-distribution Generalization
Yong Lin, Lu Tan, Yifan Hao, Honam Wong, Hanze Dong, Weizhong Zhang, Yujiu Yang, Tong Zhang

TL;DR
This paper investigates how ensemble methods, especially WiSE-FT, improve out-of-distribution generalization by leveraging diverse spurious features, supported by theoretical analysis and empirical validation, and introduces a new averaging technique called BANG.
Contribution
It provides a new theoretical understanding of ensemble effectiveness in OOD settings and proposes a novel averaging method to further enhance performance.
Findings
WiSE-FT corrects many incorrect individual predictions, boosting OOD performance.
Ensemble models utilize diverse spurious features to reduce errors in OOD data.
The BANG averaging method significantly improves WiSE-FT's OOD generalization.
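The first finding can be made concrete by counting the test examples where both individual models are wrong but the ensemble is right. The sketch below is only an illustration: the function name, the argument names, and the toy predictions are assumptions, not taken from the paper.

```python
def count_false_false_true(labels, preds_a, preds_b, preds_ensemble):
    """Count examples where both individual models are wrong and the ensemble is right.

    All four arguments are hypothetical equal-length sequences of class ids.
    """
    if not (len(labels) == len(preds_a) == len(preds_b) == len(preds_ensemble)):
        raise ValueError("all inputs must have the same length")
    return sum(
        1
        for y, a, b, e in zip(labels, preds_a, preds_b, preds_ensemble)
        if a != y and b != y and e == y
    )


# Toy data, made up for illustration only.
labels = [0, 1, 2, 1]
preds_pretrained = [1, 1, 0, 0]
preds_finetuned = [2, 0, 0, 2]
preds_ensemble = [0, 1, 2, 0]

n_fft = count_false_false_true(labels, preds_pretrained, preds_finetuned, preds_ensemble)
```

On this toy data the first and third examples qualify: the second is excluded because one individual model is already correct, and the fourth because the ensemble is also wrong.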
Abstract
Generalization to out-of-distribution (OOD) data is a critical challenge in machine learning. Ensemble-based methods, like weight space ensembles that interpolate model parameters, have been shown to achieve superior OOD performance. However, the underlying mechanism for their effectiveness remains unclear. In this study, we closely examine WiSE-FT, a popular weight space ensemble method that interpolates between a pre-trained and a fine-tuned model. We observe an unexpected "FalseFalseTrue" phenomenon, in which WiSE-FT successfully corrects many cases where each individual model makes incorrect predictions, which contributes significantly to its OOD effectiveness. To gain further insights, we conduct theoretical analysis in a multi-class setting with a large number of spurious features. Our analysis predicts the above phenomenon and it further shows that ensemble-based models reduce…
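As a rough illustration of the weight space interpolation the abstract describes, the sketch below linearly mixes the parameters of a pre-trained and a fine-tuned model. It assumes both models share the same architecture and parameter names; the function name, the dict-of-lists representation, and the mixing coefficient `alpha` are hypothetical choices made for this example, not the paper's implementation.

```python
def interpolate_weights(pretrained, finetuned, alpha=0.5):
    """Return (1 - alpha) * pretrained + alpha * finetuned, parameter by parameter.

    Both arguments are hypothetical dicts mapping parameter names to flat
    lists of floats; real models would use tensors with matching shapes.
    """
    if pretrained.keys() != finetuned.keys():
        raise ValueError("models must share the same parameter names")
    merged = {}
    for name, w_pre in pretrained.items():
        w_ft = finetuned[name]
        if len(w_pre) != len(w_ft):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        merged[name] = [(1 - alpha) * p + alpha * f for p, f in zip(w_pre, w_ft)]
    return merged


# Toy parameters, made up for illustration only.
pretrained = {"layer.weight": [0.0, 2.0], "layer.bias": [1.0]}
finetuned = {"layer.weight": [1.0, 4.0], "layer.bias": [3.0]}
merged = interpolate_weights(pretrained, finetuned, alpha=0.5)
```

With `alpha=0` the result equals the pre-trained weights and with `alpha=1` the fine-tuned weights; intermediate values give the interpolated model.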
Taxonomy
Topics: Machine Learning and Data Classification · Domain Adaptation and Few-Shot Learning · Human Pose and Action Recognition
