Quantifying With Only Positive Training Data
Denis dos Reis, Marcílio de Souto, Elaine de Sousa, Gustavo Batista

TL;DR
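This paper introduces One-class Quantification (OCQ), a setting in which the share of a class of interest in an unlabeled sample must be estimated from positive training data only. It unifies OCQ with Positive and Unlabeled Learning (PUL) and proposes two methods: Passive Aggressive Threshold (PAT), which is fast and accurate, and ExTIcE, which performs best when positive and negative observations can be identical.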
Contribution
The paper unifies PUL and OCQ under a common framework and introduces the PAT and ExTIcE algorithms, advancing quantification for scenarios where labelled data is available for only one class.
Findings
PAT is generally the fastest and most accurate algorithm.
ExTIcE outperforms other methods in scenarios where some negative observations are identical to positive ones.
PAT models can be reused for different data samples.
Abstract
Quantification is the research field that studies methods for counting the number of data points that belong to each class in an unlabeled sample. Traditionally, researchers in this field assume the availability of labelled observations for all classes to induce a quantification model. However, we often face situations where the number of classes is large or even unknown, or we have reliable data for a single class. When inducing a multi-class quantifier is infeasible, we are often concerned with estimates for a specific class of interest. In this context, we have proposed a novel setting known as One-class Quantification (OCQ). In contrast, Positive and Unlabeled Learning (PUL), another branch of Machine Learning, has offered solutions to OCQ, despite quantification not being the focal point of PUL. This article closes the gap between PUL and OCQ and brings both areas together under a…
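To make the one-class setting concrete, here is a minimal sketch of a threshold-counting estimator. It is not the paper's PAT or ExTIcE algorithm, whose details are not given on this page. It assumes a hypothetical scoring function has already mapped each observation to a number where higher means "more like the positive class", and that negatives rarely score above the chosen quantile of the positive training scores. The function name `estimate_positive_ratio` and the parameter `q` are illustrative choices.

```python
import numpy as np


def estimate_positive_ratio(pos_scores, unlabeled_scores, q=0.5):
    """Estimate the fraction of positives in an unlabeled sample.

    Assumptions (illustrative, not taken from the paper):
    - higher score means more similar to the positive class;
    - negatives rarely score above the q-th quantile of positive scores.
    """
    pos_scores = np.asarray(pos_scores, dtype=float)
    unlabeled_scores = np.asarray(unlabeled_scores, dtype=float)

    # Threshold learned from positive training data only.
    threshold = np.quantile(pos_scores, q)

    # Roughly (1 - q) of the positives are expected above the threshold,
    # so the observed fraction is rescaled by 1 / (1 - q) and capped at 1.
    frac_above = np.mean(unlabeled_scores > threshold)
    return float(min(1.0, frac_above / (1.0 - q)))


if __name__ == "__main__":
    # Synthetic scores: positives spread over [1, 3], negatives over [-3, -1].
    train_pos = np.linspace(1.0, 3.0, 101)
    unlabeled = np.concatenate([
        np.linspace(1.0, 3.0, 30),    # 30 hidden positives
        np.linspace(-3.0, -1.0, 70),  # 70 hidden negatives
    ])
    print(estimate_positive_ratio(train_pos, unlabeled))
```

Because the threshold depends only on the positive training scores, the same fitted threshold can be applied to any number of unlabeled samples. The estimate degrades when negatives overlap heavily with the positive score range, which this sketch does not address.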
Taxonomy
Topics: Machine Learning and Data Classification · Imbalanced Data Classification Techniques · Anomaly Detection Techniques and Applications
