Better May Not Be Fairer: A Study on Subgroup Discrepancy in Image Classification
Ming-Chang Chiu, Pin-Yu Chen, Xuezhe Ma

TL;DR
This paper investigates how natural background colors act as spurious features in image classification, introduces datasets with annotated backgrounds (CIFAR10-B and CIFAR100-B), and proposes FlowAug, a semantic data augmentation method, to improve subgroup performance and robustness.
Contribution
The paper provides annotated datasets that expose background color as a spurious feature, introduces FlowAug for semantic data augmentation, and proposes MacroStd, a new metric for model robustness to spurious correlations.
Findings
FlowAug improves subgroup consistency and generalization.
Background color influences model performance across datasets.
MacroStd correlates with improved robustness and subgroup performance.
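The subgroup-consistency idea behind these findings can be illustrated with a minimal sketch: compute accuracy separately for each background-color subgroup, then measure how much those accuracies spread. The function names, the toy labels, and the use of a plain population standard deviation are assumptions made for illustration only; this is not the paper's definition of MacroStd, which the summary above does not spell out.

```python
from collections import defaultdict
from statistics import pstdev


def subgroup_accuracies(y_true, y_pred, groups):
    """Return {group: accuracy} for each subgroup label (hypothetical helper)."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p, g in zip(y_true, y_pred, groups):
        total[g] += 1
        correct[g] += int(t == p)
    return {g: correct[g] / total[g] for g in total}


def subgroup_std(y_true, y_pred, groups):
    """Population std of per-subgroup accuracies.

    An illustrative spread measure only, not necessarily the paper's MacroStd.
    """
    accs = subgroup_accuracies(y_true, y_pred, groups)
    return pstdev(list(accs.values()))


# Toy data: class labels, predictions, and an assumed background-color tag per image.
y_true = [0, 0, 1, 1, 0, 1, 0, 1]
y_pred = [0, 0, 1, 1, 0, 0, 1, 1]
groups = ["blue", "blue", "blue", "blue", "green", "green", "green", "green"]

accs = subgroup_accuracies(y_true, y_pred, groups)
spread = subgroup_std(y_true, y_pred, groups)
print(accs, spread)
```

In this toy case the overall accuracy is 0.75, yet one subgroup is classified perfectly while the other sits at 0.5, which is the kind of gap an aggregate accuracy number hides.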
Abstract
In this paper, we provide 20,000 non-trivial human annotations on popular datasets as a first step toward bridging the gap in studying how natural semantic spurious features affect image classification, as prior works often study datasets mixing low-level features due to limitations in accessing realistic datasets. We investigate how natural background colors play a role as spurious features by annotating the test sets of CIFAR10 and CIFAR100 into subgroups based on the background color of each image. We name our datasets CIFAR10-B and CIFAR100-B and integrate them with CIFAR-Cs. We find that overall human-level accuracy does not guarantee consistent subgroup performance, and the phenomenon remains even on models pre-trained on ImageNet or after data augmentation (DA). To alleviate this issue, we propose FlowAug, a semantic DA that leverages decoupled semantic…
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · COVID-19 diagnosis using AI · Multimodal Machine Learning Applications
Methods: Test
