Class-Balanced Loss Based on Effective Number of Samples
Yin Cui, Menglin Jia, Tsung-Yi Lin, Yang Song, Serge Belongie

TL;DR
This paper introduces a novel class-balanced loss function based on the effective number of samples, which improves performance on long-tailed datasets by re-weighting classes according to a new theoretical measure.
Contribution
The paper proposes a new theoretical framework for measuring data overlap and defines the effective number of samples to enhance class re-balancing in long-tailed datasets.
Findings
Significant performance improvements on long-tailed CIFAR datasets.
Effective number-based re-weighting outperforms traditional methods.
Successful application to large-scale datasets like ImageNet and iNaturalist.
Abstract
With the rapid increase of large-scale, real-world datasets, it becomes critical to address the problem of long-tailed data distribution (i.e., a few classes account for most of the data, while most classes are under-represented). Existing solutions typically adopt class re-balancing strategies such as re-sampling and re-weighting based on the number of observations for each class. In this work, we argue that as the number of samples increases, the additional benefit of a newly added data point will diminish. We introduce a novel theoretical framework to measure data overlap by associating with each sample a small neighboring region rather than a single point. The effective number of samples is defined as the volume of samples and can be calculated by a simple formula $(1-\beta^{n})/(1-\beta)$, where $n$ is the number of samples and $\beta \in [0,1)$ is a hyperparameter. We design a…
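As an illustration of the formula in the abstract, the following Python sketch computes the effective number of samples $E_n = (1-\beta^{n})/(1-\beta)$ and derives per-class weights from it. The function names are hypothetical, and two choices are assumptions rather than something this page states: each class weight is taken to be inversely proportional to its effective number, and the weights are rescaled to sum to the number of classes. Note the two limits: $\beta = 0$ gives $E_n = 1$ (no re-weighting), while $\beta \to 1$ gives $E_n \to n$ (inverse-frequency weighting).

```python
def effective_number(n, beta):
    """Effective number of samples E_n = (1 - beta**n) / (1 - beta)."""
    if not 0.0 <= beta < 1.0:
        raise ValueError("beta must lie in [0, 1)")
    return (1.0 - beta ** n) / (1.0 - beta)


def class_balanced_weights(counts, beta):
    """Per-class weights proportional to 1 / E_n (illustrative helper).

    Assumption: weights are normalized so they sum to the number of
    classes, which keeps the overall loss scale roughly unchanged.
    """
    raw = [1.0 / effective_number(n, beta) for n in counts]
    scale = len(counts) / sum(raw)
    return [w * scale for w in raw]


# Example: a head class with 5000 samples and a tail class with 50.
weights = class_balanced_weights([5000, 50], beta=0.999)
```

In this example the tail class receives the larger weight, and a loss term for a sample of class `i` would be multiplied by `weights[i]`.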
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · COVID-19 diagnosis using AI
