Delving into Semantic Scale Imbalance

Yanbiao Ma; Licheng Jiao; Fang Liu; Yuxin Li; Shuyuan Yang; Xu Liu

arXiv:2212.14613 · cs.CV · April 11, 2023 · 5 citations

Open Access

TL;DR

This paper introduces the concept of semantic scale imbalance to better understand model bias in long-tailed data, proposing a new measurement and training framework that improves performance across diverse datasets.

Contribution

It defines and quantifies semantic scale imbalance, and develops a semantic-scale-balanced learning method that enhances model performance on various datasets.

Findings

01. Semantic scale correlates with classification performance.

02. The proposed method improves results on long-tailed datasets.

03. Model bias persists even with balanced data, explained by semantic scale imbalance.

Abstract

Model bias triggered by long-tailed data has been widely studied. However, measures based on the number of samples cannot explain three phenomena simultaneously: (1) Given enough data, additional samples yield only marginal gains in classification performance. (2) When data is insufficient, classification performance decays precipitously as the number of training samples decreases. (3) Models trained on sample-balanced datasets still have different biases for different classes. In this work, we define and quantify the semantic scale of classes, which measures the feature diversity of classes. We find experimentally that semantic scale has a marginal effect, which perfectly describes the first two phenomena. Further, we propose a quantitative measurement of semantic scale imbalance, which can accurately reflect model bias on multiple datasets, even on…
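The abstract describes semantic scale as a measure of the feature diversity of a class, but this page does not reproduce the paper's formula. As a rough illustration only, the sketch below uses a log-determinant "volume" of the class feature covariance as a stand-in diversity score. The function name `feature_diversity`, the choice of score, and the synthetic features are all assumptions made for this example, not the authors' definition.

```python
import numpy as np


def feature_diversity(features: np.ndarray) -> float:
    """Hypothetical diversity proxy: 0.5 * log2 det(I + Cov(features)).

    `features` has shape (n_samples, n_dims). This is an assumed stand-in
    for "semantic scale", not the definition used in the paper.
    """
    # Center the features so the score depends on spread, not location.
    centered = features - features.mean(axis=0, keepdims=True)
    n_samples, n_dims = centered.shape
    cov = centered.T @ centered / n_samples
    # slogdet is numerically safer than log(det(...)) for small volumes.
    _, logdet = np.linalg.slogdet(np.eye(n_dims) + cov)
    return 0.5 * logdet / np.log(2.0)


# Two synthetic "classes" with the same sample count but different spread.
rng = np.random.default_rng(0)
narrow_class = rng.normal(scale=0.1, size=(200, 8))
broad_class = rng.normal(scale=1.0, size=(200, 8))

narrow_score = feature_diversity(narrow_class)
broad_score = feature_diversity(broad_class)
constant_score = feature_diversity(np.ones((50, 8)))
```

Both synthetic classes contain 200 samples, yet the score separates them by spread. This mirrors the abstract's third point: equal sample counts do not imply equal feature diversity.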


Taxonomy

Topics: Imbalanced Data Classification Techniques · Machine Learning in Healthcare · COVID-19 diagnosis using AI