Let Samples Speak: Mitigating Spurious Correlation by Exploiting the Clusterness of Samples
Weiwei Li, Junzhuo Liu, Yuanyuan Ren, Yuchen Zheng, Yahao Liu, Wen Li

TL;DR
This paper introduces a data-driven method that reduces spurious correlations in deep learning models by exploiting how samples cluster in the learned feature space, yielding less biased models with substantially higher worst-group accuracy.
Contribution
The paper proposes a pipeline that identifies, neutralizes, and eliminates spurious features through sample clustering and feature transformation, improving model robustness.
Findings
Improves worst-group accuracy by over 20% on benchmarks
Effective on both image and NLP debiasing tasks
Significantly outperforms standard ERM
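Worst-group accuracy, the metric cited in the findings above, is the lowest accuracy over all groups, where a group is typically one combination of class label and spurious attribute. The following is a minimal sketch of how the metric is commonly computed, not code from the paper; the function name `worst_group_accuracy` and the flat `groups` array of group ids are assumptions made for illustration.

```python
import numpy as np

def worst_group_accuracy(y_true, y_pred, groups):
    """Return (worst accuracy, per-group accuracy dict).

    `groups` holds one hypothetical group id per sample, e.g. an id for
    each (class label, spurious attribute) combination.
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    groups = np.asarray(groups)

    correct = y_true == y_pred
    per_group = {}
    for g in np.unique(groups).tolist():
        # Accuracy restricted to the samples of group g.
        per_group[g] = float(correct[groups == g].mean())

    # The metric is the minimum of the per-group accuracies.
    return min(per_group.values()), per_group
```

A model can score well on average while failing on a minority group, which is why the minimum over groups is reported instead of the mean over samples.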
Abstract
Deep learning models are known to often learn features that spuriously correlate with the class label during training but are irrelevant to the prediction task. Existing methods typically address this issue by annotating potential spurious attributes, or filtering spurious features based on some empirical assumptions (e.g., simplicity of bias). However, these methods may yield unsatisfactory performance due to the intricate and elusive nature of spurious correlations in real-world data. In this paper, we propose a data-oriented approach to mitigate the spurious correlation in deep learning models. We observe that samples that are influenced by spurious features tend to exhibit a dispersed distribution in the learned feature space. This allows us to identify the presence of spurious features. Subsequently, we obtain a bias-invariant representation by neutralizing the spurious features…
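The abstract's observation that affected samples are dispersed in the learned feature space can be illustrated with a simple proxy. The sketch below is not the authors' procedure: it assumes that dispersion is measured as Euclidean distance to the class centroid and that samples beyond a per-class quantile are flagged. The names `dispersion_scores` and `flag_dispersed`, and the `quantile` threshold, are hypothetical.

```python
import numpy as np

def dispersion_scores(features, labels):
    """Distance of each sample to the centroid of its own class.

    Assumption: centroid distance is used here as a stand-in for the
    notion of dispersion; the paper may define it differently.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    scores = np.empty(len(labels), dtype=float)
    for c in np.unique(labels):
        mask = labels == c
        centroid = features[mask].mean(axis=0)
        scores[mask] = np.linalg.norm(features[mask] - centroid, axis=1)
    return scores

def flag_dispersed(features, labels, quantile=0.9):
    """Flag samples whose score exceeds the per-class quantile (hypothetical rule)."""
    labels = np.asarray(labels)
    scores = dispersion_scores(features, labels)
    flagged = np.zeros(len(labels), dtype=bool)
    for c in np.unique(labels):
        mask = labels == c
        threshold = np.quantile(scores[mask], quantile)
        flagged[mask] = scores[mask] > threshold
    return flagged
```

In practice `features` would be embeddings taken from a trained network, for example the penultimate-layer activations, and the flagged samples would be candidates for further inspection.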
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Ethics and Social Impacts of AI
