Compositional Attribute Imbalance in Vision Datasets
Jiayi Chen, Yanbiao Ma, Andi Zhang, Weidong Tang, Wei Dai, Bowei Liu

TL;DR
This paper addresses the issue of attribute imbalance in image datasets by introducing a CLIP-based framework for automatic attribute evaluation and proposing sampling and augmentation strategies to improve model robustness and fairness.
Contribution
The paper presents a CLIP-based method for analyzing attribute imbalance and a sampling adjustment based on the rarity of compositional attributes, combined with augmentation techniques such as CutMix, FMix, and SaliencyMix, to mitigate the effects of imbalance.
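As a rough illustration of how an attribute dictionary might be used to label images automatically, the sketch below assigns an image the attribute whose text embedding is closest to the image embedding by cosine similarity. This is an assumption about the general approach, not the authors' implementation: the function names, the `color_dictionary`, and the 3-dimensional vectors are hypothetical stand-ins for embeddings that a CLIP model would produce.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def assign_attribute(image_emb, attribute_embs):
    """Return the attribute name whose embedding is most similar to the image."""
    return max(attribute_embs, key=lambda name: cosine(image_emb, attribute_embs[name]))

# Toy vectors standing in for CLIP text embeddings (made up for illustration).
color_dictionary = {
    "red": [1.0, 0.0, 0.0],
    "green": [0.0, 1.0, 0.0],
    "blue": [0.0, 0.0, 1.0],
}

# Toy vector standing in for a CLIP image embedding (also made up).
image_embedding = [0.9, 0.1, 0.2]
predicted = assign_attribute(image_embedding, color_dictionary)
```

Running the same assignment over every image and every attribute family would give the per-attribute counts from which single-attribute and compositional imbalance can be measured.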
Findings
Improved model performance on imbalanced datasets.
Enhanced robustness and fairness of image classifiers.
Effective mitigation of attribute imbalance through proposed methods.
Abstract
Visual attribute imbalance is a common yet underexplored issue in image classification, significantly impacting model performance and generalization. In this work, we first define the first-level and second-level attributes of images and then introduce a CLIP-based framework to construct a visual attribute dictionary, enabling automatic evaluation of image attributes. By systematically analyzing both single-attribute imbalance and compositional attribute imbalance, we reveal how the rarity of attributes affects model performance. To tackle these challenges, we propose adjusting the sampling probability of samples based on the rarity of their compositional attributes. This strategy is further integrated with various data augmentation techniques (such as CutMix, FMix, and SaliencyMix) to enhance the model's ability to represent rare attributes. Extensive experiments on benchmark datasets…
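The combination of rarity-based sampling and mixing augmentation described in the abstract can be sketched as follows. The exact weighting formula is not given in this summary, so the inverse-frequency rule and its `power` parameter are assumptions; the CutMix step follows the standard recipe (paste a random box from one image into another and mix labels by area), written here over nested lists so that the example needs only the standard library. All data are toy values.

```python
import random
from collections import Counter

def rarity_probabilities(combos, power=1.0):
    """Per-sample sampling probabilities, inversely proportional to how often
    each sample's attribute combination occurs (assumed weighting rule)."""
    counts = Counter(combos)
    raw = [counts[c] ** -power for c in combos]
    total = sum(raw)
    return [w / total for w in raw]

def cutmix(img_a, img_b, label_a, label_b, lam, rng):
    """Paste a random box from img_b into a copy of img_a.
    Returns the mixed image, the mixed label, and the area-adjusted lambda."""
    h, w = len(img_a), len(img_a[0])
    cut_h = int(h * (1 - lam) ** 0.5)
    cut_w = int(w * (1 - lam) ** 0.5)
    cy, cx = rng.randrange(h), rng.randrange(w)
    top, bottom = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, h)
    left, right = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, w)
    mixed = [row[:] for row in img_a]  # copy so img_a is not modified
    for y in range(top, bottom):
        mixed[y][left:right] = img_b[y][left:right]
    # Recompute lambda from the box actually pasted (clipping may shrink it).
    lam_adj = 1 - (bottom - top) * (right - left) / (h * w)
    mixed_label = [lam_adj * a + (1 - lam_adj) * b for a, b in zip(label_a, label_b)]
    return mixed, mixed_label, lam_adj

# Toy dataset: ten samples with three attribute combinations of differing rarity.
combos = [("red", "smooth")] * 6 + [("red", "striped")] * 3 + [("blue", "striped")]
images = [[[float(k)] * 4 for _ in range(4)] for k in range(len(combos))]
labels = [[1.0, 0.0] if k < 6 else [0.0, 1.0] for k in range(len(combos))]

probs = rarity_probabilities(combos)
rng = random.Random(0)
# Draw two samples with rarity-weighted probabilities, then mix them.
i, j = rng.choices(range(len(combos)), weights=probs, k=2)
lam = rng.betavariate(1.0, 1.0)
mixed, mixed_label, lam_adj = cutmix(images[i], images[j], labels[i], labels[j], lam, rng)
```

With `power=1.0` every attribute combination receives the same total sampling mass regardless of how many samples carry it; setting `power=0.0` recovers uniform sampling, and values in between give a softer correction.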
Taxonomy
Topics: Geochemistry and Geologic Mapping
Methods: CutMix
