On the Out-of-Distribution Generalization of Self-Supervised Learning
Wenwen Qiang, Jingyao Wang, Zeen Song, Jiangmeng Li, Changwen Zheng

TL;DR
This paper investigates the out-of-distribution generalization of self-supervised learning, identifying spurious correlations as a key issue and proposing a causal inference-based sampling strategy to improve OOD robustness.
Contribution
It introduces a novel post-intervention distribution framework and a batch sampling method grounded in causal inference to enhance OOD generalization in SSL.
Findings
Proposed a causal inference framework for SSL OOD analysis
Developed a batch sampling strategy satisfying PID constraints
Validated improved OOD performance through experiments
Abstract
In this paper, we focus on the out-of-distribution (OOD) generalization of self-supervised learning (SSL). By analyzing the mini-batch construction during the SSL training phase, we first give one plausible explanation for why SSL exhibits OOD generalization. Then, from the perspective of data generation and causal inference, we analyze and conclude that SSL learns spurious correlations during the training process, which leads to a reduction in OOD generalization. To address this issue, we propose a post-intervention distribution (PID) grounded in the Structural Causal Model. PID offers a scenario where the spurious variable and the label variable are mutually independent. Furthermore, we demonstrate that if each mini-batch during SSL training satisfies PID, the resulting SSL model can achieve optimal worst-case OOD performance. This motivates us to develop a batch sampling strategy that enforces PID…
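To make the independence condition concrete, here is a minimal sketch of a mini-batch sampler in which the label and the spurious variable are empirically independent. It is not the authors' sampling strategy, which this excerpt does not describe. It assumes, purely for illustration, that both variables are observed and discrete; the function name `sample_balanced_batch` and its parameters are hypothetical.

```python
import numpy as np


def sample_balanced_batch(labels, spurious, batch_size, rng=None):
    """Return indices for a batch with equal counts in every (label, spurious) cell.

    Illustrative assumption: both variables are observed and discrete.
    A uniform joint over the cells factorizes into the product of its
    (uniform) marginals, so the two variables are independent in the batch.
    """
    rng = np.random.default_rng() if rng is None else rng
    labels = np.asarray(labels)
    spurious = np.asarray(spurious)

    # Enumerate every combination of label value and spurious value.
    cells = [(y, s) for y in np.unique(labels) for s in np.unique(spurious)]
    per_cell, remainder = divmod(batch_size, len(cells))
    if per_cell == 0 or remainder != 0:
        raise ValueError("batch_size must be a positive multiple of the number of cells")

    chosen = []
    for y, s in cells:
        pool = np.flatnonzero((labels == y) & (spurious == s))
        if pool.size == 0:
            raise ValueError(f"no samples with label={y!r} and spurious={s!r}")
        # Fall back to sampling with replacement when a cell is too small.
        chosen.append(rng.choice(pool, size=per_cell, replace=pool.size < per_cell))

    batch = np.concatenate(chosen)
    rng.shuffle(batch)  # avoid ordering the batch by cell
    return batch
```

Even if the full dataset has a strong label-spurious correlation, every batch drawn this way contains each combination equally often, so a model cannot lower its batch loss by relying on the spurious variable alone. Rare cells are oversampled with replacement, which is a design trade-off of this sketch rather than something stated in the source.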
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Topic Modeling · Neural Networks and Applications
Methods: Focus
