Compositional Attribute Imbalance in Vision Datasets

Jiayi Chen; Yanbiao Ma; Andi Zhang; Weidong Tang; Wei Dai; Bowei Liu

arXiv:2506.14418·cs.CV·June 18, 2025

TL;DR

This paper addresses attribute imbalance in image datasets. It introduces a CLIP-based framework for automatically evaluating image attributes and proposes sampling and augmentation strategies intended to improve model robustness and fairness.

Contribution

The paper presents a CLIP-based method for analyzing attribute imbalance and combines a rarity-based adjustment of sampling probabilities with data augmentation to mitigate the effects of that imbalance.

Findings

01. Improved model performance on imbalanced datasets.

02. Enhanced robustness and fairness of image classifiers.

03. Effective mitigation of attribute imbalance through the proposed methods.

Abstract

Visual attribute imbalance is a common yet underexplored issue in image classification, significantly impacting model performance and generalization. In this work, we first define the first-level and second-level attributes of images and then introduce a CLIP-based framework to construct a visual attribute dictionary, enabling automatic evaluation of image attributes. By systematically analyzing both single-attribute imbalance and compositional attribute imbalance, we reveal how the rarity of attributes affects model performance. To tackle these challenges, we propose adjusting the sampling probability of samples based on the rarity of their compositional attributes. This strategy is further integrated with various data augmentation techniques (such as CutMix, FMix, and SaliencyMix) to enhance the model's ability to represent rare attributes. Extensive experiments on benchmark datasets…
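The sampling adjustment described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's actual weighting formula, so the inverse-frequency rule, the `alpha` exponent, the function name `rarity_sampling_probs`, and the toy attribute labels below are all assumptions made for illustration only.

```python
from collections import Counter
import random


def rarity_sampling_probs(attr_combos, alpha=1.0):
    """Per-sample sampling probabilities that favor rare attribute combinations.

    Assumed rule (not taken from the paper): weight = (1 / count) ** alpha,
    where count is how often the sample's attribute combination occurs.
    alpha = 0 gives uniform sampling; larger alpha favors rare combinations more.
    """
    counts = Counter(attr_combos)
    weights = [(1.0 / counts[combo]) ** alpha for combo in attr_combos]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical toy data: each sample is tagged with a tuple of attribute labels.
combos = (
    [("red", "smooth")] * 6      # common combination
    + [("blue", "smooth")] * 3   # less common
    + [("red", "striped")]       # rare
)

probs = rarity_sampling_probs(combos)

# Draw a small batch of sample indices with the adjusted probabilities.
rng = random.Random(0)
batch_indices = rng.choices(range(len(combos)), weights=probs, k=4)
```

With `alpha=1.0`, every distinct combination receives the same total probability mass, so the single `("red", "striped")` sample is drawn as often as all six `("red", "smooth")` samples together. Samples drawn this way could then be passed to a mixing augmentation such as CutMix, which is how the abstract describes the two steps being combined.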

Taxonomy

Topics: Geochemistry and Geologic Mapping

Methods: CutMix