Semi-Supervised Learning of Class Balance under Class-Prior Change by Distribution Matching
Marthinus Du Plessis (Tokyo Institute of Technology), Masashi Sugiyama (Tokyo Institute of Technology)

TL;DR
This paper introduces a method to estimate class ratios in test datasets with unknown class balance by matching input data distributions, addressing bias issues in real-world classification tasks.
Contribution
It proposes a novel distribution matching approach for class ratio estimation under class-prior change without requiring labeled test data.
Findings
Effective class ratio estimation demonstrated in experiments
Reduces bias caused by class imbalance in test data
Applicable to real-world classification scenarios
Abstract
In real-world classification problems, the class balance in the training dataset does not necessarily reflect that of the test dataset, which can cause significant estimation bias. If the class ratio of the test dataset is known, instance re-weighting or resampling allows systematic bias correction. However, learning the class ratio of the test dataset is challenging when no labeled data is available from the test domain. In this paper, we propose to estimate the class ratio in the test dataset by matching probability distributions of training and test input data. We demonstrate the utility of the proposed approach through experiments.
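To make the idea in the abstract concrete, here is a minimal Python sketch for the binary case. The abstract does not say which matching criterion the authors use, so this sketch assumes a simple stand-in: it picks the mixing proportion that minimizes the squared distance between Gaussian-kernel mean embeddings of the training mixture and the unlabeled test inputs. The function names, the kernel, and the bandwidth `sigma` are illustrative assumptions, not the paper's method. The second function shows the standard per-class re-weighting that becomes possible once the test class ratio is known.

```python
import numpy as np


def gaussian_kernel_mean(a, b, sigma):
    """Average Gaussian kernel value over all pairs (x in a, y in b)."""
    sq_dists = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dists / (2.0 * sigma ** 2)).mean()


def estimate_class_prior(x_pos, x_neg, x_test, sigma=1.0):
    """Estimate the positive-class proportion in x_test (assumed stand-in).

    Models the test inputs as theta * p(x | pos) + (1 - theta) * p(x | neg)
    and returns the theta in [0, 1] that minimizes the squared distance
    between the kernel mean embeddings of the mixture and of the test set.
    The unconstrained minimizer has a closed form, which is then clipped.
    """
    k = gaussian_kernel_mean
    # Inner product <mu_pos - mu_neg, mu_test - mu_neg> in the kernel space.
    numerator = (k(x_pos, x_test, sigma) - k(x_pos, x_neg, sigma)
                 - k(x_neg, x_test, sigma) + k(x_neg, x_neg, sigma))
    # Squared norm ||mu_pos - mu_neg||^2 in the kernel space.
    denominator = (k(x_pos, x_pos, sigma) - 2.0 * k(x_pos, x_neg, sigma)
                   + k(x_neg, x_neg, sigma))
    return float(np.clip(numerator / denominator, 0.0, 1.0))


def class_weights(train_prior, test_prior):
    """Per-class instance weights p_test(y) / p_train(y) for (pos, neg)."""
    w_pos = test_prior / train_prior
    w_neg = (1.0 - test_prior) / (1.0 - train_prior)
    return w_pos, w_neg


# Usage on synthetic data: balanced training set, test set with more positives.
rng = np.random.default_rng(0)
train_pos = rng.normal(2.0, 1.0, size=(300, 1))
train_neg = rng.normal(-2.0, 1.0, size=(300, 1))
test_inputs = np.vstack([
    rng.normal(2.0, 1.0, size=(400, 1)),   # 80% positives (labels unseen)
    rng.normal(-2.0, 1.0, size=(100, 1)),  # 20% negatives (labels unseen)
])
estimated_prior = estimate_class_prior(train_pos, train_neg, test_inputs)
weights = class_weights(train_prior=0.5, test_prior=estimated_prior)
```

The sketch relies on the class-conditional input distributions being the same in training and test data, so that only the class proportions change. The bandwidth is fixed here for brevity; in practice it would need to be chosen, for example by cross-validation.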
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Imbalanced Data Classification Techniques · Anomaly Detection Techniques and Applications
