Improving Group Robustness on Spurious Correlation Requires Preciser Group Inference
Yujin Han, Difan Zou

TL;DR
This paper introduces GIC, a novel method for more accurately inferring group labels to improve worst-group accuracy in models affected by spurious correlations, without relying on expensive annotations.
Contribution
GIC leverages properties of spurious correlations to infer group labels more precisely, enhancing robustness and compatibility with various invariant learning methods.
Findings
GIC improves worst-group accuracy across multiple datasets.
Combining GIC with invariant learning methods yields significant performance gains.
Analysis of GIC's misclassifications reveals a semantic consistency phenomenon that may help decouple spurious attributes from labels.
Abstract
Standard empirical risk minimization (ERM) models may prioritize learning spurious correlations between spurious features and true labels, leading to poor accuracy on groups where these correlations do not hold. Mitigating this issue often requires expensive spurious attribute (group) labels or relies on trained ERM models to infer group labels when group information is unavailable. However, the significant performance gap in worst-group accuracy between using pseudo group labels and using oracle group labels inspires us to consider further improving group robustness through more precise group inference. Therefore, we propose GIC, a novel method that accurately infers group labels, resulting in improved worst-group performance. GIC trains a spurious attribute classifier based on two key properties of spurious correlations: (1) high correlation between spurious attributes and true labels,…
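To make the worst-group metric concrete, here is a minimal sketch. It assumes each group is a (true label, spurious attribute) pair, which is the usual convention in this line of work. The function name `worst_group_accuracy` and the toy data are hypothetical illustrations, not taken from the paper, and the sketch does not implement GIC itself.

```python
from collections import defaultdict


def worst_group_accuracy(labels, spurious, preds):
    """Return (worst-group accuracy, per-group accuracy dict).

    Assumption: a group is the pair (true label, spurious attribute).
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for y, a, p in zip(labels, spurious, preds):
        group = (y, a)
        total[group] += 1
        correct[group] += int(p == y)
    per_group = {g: correct[g] / total[g] for g in total}
    return min(per_group.values()), per_group


# Toy data: the spurious attribute agrees with the label in 6 of 8 samples.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
spurious = [0, 0, 0, 1, 1, 1, 1, 0]

# A hypothetical model that has latched onto the spurious attribute
# simply predicts it instead of the true label.
preds = list(spurious)

average_acc = sum(int(p == y) for p, y in zip(preds, labels)) / len(labels)
worst_acc, per_group_acc = worst_group_accuracy(labels, spurious, preds)

print(f"average accuracy:     {average_acc:.2f}")
print(f"worst-group accuracy: {worst_acc:.2f}")
```

In this toy setting the average accuracy looks acceptable while the two minority groups, where the correlation does not hold, are always misclassified. That is the gap group-robust methods aim to close.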
Taxonomy
Topics: Advanced Text Analysis Techniques · Advanced Statistical Modeling Techniques · Text and Document Classification Technologies
Methods: Graph InfoClust
