Imbalanced Classification under Capacity Constraints
Daniel Fraiman, Ricardo Fraiman

TL;DR
This paper introduces a classification framework for imbalanced data that caps the rate of positive predictions at a user-defined level, maximizes detection performance under that capacity limit, and extends to online decision-making.
Contribution
It presents a novel approach that explicitly enforces a user-defined bound on the proportion of positive predictions, improving over traditional resampling methods such as SMOTE.
Findings
Capacity-aware classification improves detection performance.
The method extends naturally to online, real-time decision settings.
It outperforms classical resampling techniques in imbalanced scenarios.
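To make the online setting concrete, here is a minimal sketch of a streaming selector that keeps the running proportion of flagged observations at or below a bound. It is an illustration under stated assumptions, not the authors' procedure: the class name `OnlineCapacitySelector`, the warm-up sample, the empirical-quantile threshold, and the budget check are all hypothetical choices, and the scores are assumed to come from some already-fitted model.

```python
import numpy as np


class OnlineCapacitySelector:
    """Hypothetical helper: flags stream items while keeping the running
    flag rate at or below alpha. Not the paper's algorithm."""

    def __init__(self, alpha, warmup_scores):
        self.alpha = float(alpha)
        # Past scores used to estimate the threshold; must be non-empty.
        self.history = [float(s) for s in warmup_scores]
        self.seen = 0      # observations processed so far (excluding warm-up)
        self.flagged = 0   # observations flagged so far

    def decide(self, score):
        # Threshold: empirical (1 - alpha) quantile of all scores seen so far.
        threshold = np.quantile(self.history, 1.0 - self.alpha)
        self.seen += 1
        # Hard budget: flagging must not push the running rate above alpha.
        within_budget = (self.flagged + 1) <= self.alpha * self.seen
        flag = bool(score > threshold and within_budget)
        if flag:
            self.flagged += 1
        self.history.append(float(score))
        return flag


# Usage on synthetic scores (assumed data, for illustration only).
rng = np.random.default_rng(0)
selector = OnlineCapacitySelector(alpha=0.1, warmup_scores=rng.random(100))
decisions = [selector.decide(s) for s in rng.random(2000)]
print(f"flag rate: {selector.flagged / selector.seen:.3f}")
```

Recomputing the quantile at every step costs time linear in the history length, which is acceptable for a sketch; a production system would likely use a streaming quantile estimator instead.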
Abstract
In many classification settings, the class of primary interest is underrepresented, leading to imbalanced data problems that arise in applications such as rare disease detection and fraud identification. In these contexts, identifying a potential positive instance typically triggers costly follow-up actions, such as medical imaging or detailed transaction inspection, which are subject to limited operational capacity. Motivated by this setting, we consider classification problems where data may arrive sequentially and decisions must be made under constraints on the number of instances that can be selected for further analysis. We propose a classification framework that explicitly controls the rate of positive predictions, enforcing a user-defined bound on the proportion of observations classified as belonging to the minority class while maximizing detection performance. The approach can…
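The core idea of bounding the proportion of positive predictions can be sketched in the batch case as follows. This is a simple top-k rule under stated assumptions, not the authors' estimator: the function names `select_within_capacity` and `detection_rate`, the bound `alpha`, and the synthetic scores are hypothetical, and the scores are assumed to come from any fitted model where higher means more likely positive.

```python
import numpy as np


def select_within_capacity(scores, alpha):
    """Flag the highest-scoring observations, at most floor(alpha * n) of them.

    Hypothetical helper for illustration; not the paper's method.
    """
    scores = np.asarray(scores, dtype=float)
    n = scores.size
    k = int(np.floor(alpha * n))          # capacity: maximum number of positives
    selected = np.zeros(n, dtype=bool)
    if k > 0:
        top = np.argsort(-scores, kind="stable")[:k]  # indices of the k largest scores
        selected[top] = True
    return selected


def detection_rate(selected, y_true):
    """Fraction of true positives that were flagged (recall)."""
    y_true = np.asarray(y_true, dtype=bool)
    if y_true.sum() == 0:
        return float("nan")               # undefined without any positives
    return float((selected & y_true).sum() / y_true.sum())


# Synthetic imbalanced data (assumed, for illustration only): about 2% positives,
# with positive scores shifted upward.
rng = np.random.default_rng(0)
y = rng.random(1000) < 0.02
scores = rng.normal(loc=2.0 * y, scale=1.0)

flags = select_within_capacity(scores, alpha=0.05)
print("flagged proportion:", flags.mean())
print("detection rate:", detection_rate(flags, y))
```

Ranking and cutting at a fixed count guarantees the capacity bound regardless of how well calibrated the scores are, which is why the sketch avoids a fixed probability threshold such as 0.5.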
