Positive and Unlabeled Learning through Negative Selection and Imbalance-aware Classification
Marco Frasca, Nicolò Cesa-Bianchi

TL;DR
This paper introduces a novel learning algorithm for positive and unlabeled data that combines active negative example selection with imbalance-aware classification, improving performance in protein function prediction tasks.
Contribution
The paper presents a method for positive and unlabeled (PU) learning that integrates active learning, used to choose negative examples among the unlabeled data, with imbalance-aware learning, addressing both label scarcity and class imbalance.
Findings
Outperforms state-of-the-art methods on standard protein function prediction benchmarks in the authors' experiments
Active negative selection and imbalance-aware learning work synergistically
Effective in scenarios with scarce positive labels and no explicit negatives
Abstract
Motivated by applications in protein function prediction, we consider a challenging supervised classification setting in which positive labels are scarce and there are no explicit negative labels. The learning algorithm must thus select which unlabeled examples to use as negative training points, possibly ending up with an unbalanced learning problem. We address these issues by proposing an algorithm that combines active learning (for selecting negative examples) with imbalance-aware learning (for mitigating the label imbalance). In our experiments we observe that these two techniques operate synergistically, outperforming state-of-the-art methods on standard protein function prediction benchmarks.
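To make the two ingredients named in the abstract concrete, the following is a minimal sketch, not the authors' algorithm. The abstract does not state the selection criterion or the classifier, so both are assumptions here: negatives are picked by uncertainty sampling (unlabeled points whose score is closest to 0.5), and imbalance is handled by weighting each class inversely to its frequency in a small logistic regression. The names `pu_fit`, `fit_weighted_logreg`, `rounds`, and `batch` are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(w, b, x):
    # Estimated probability that x is positive under a linear model.
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def fit_weighted_logreg(X, y, epochs=200, lr=0.1):
    # Imbalance-aware step (assumed form): each class is weighted by
    # n_samples / (2 * class_count), so both classes contribute equally.
    n_pos = sum(y)
    n_neg = len(y) - n_pos
    cw = {1: len(y) / (2.0 * n_pos), 0: len(y) / (2.0 * n_neg)}
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = cw[yi] * (predict_proba(w, b, xi) - yi)
            gw = [g + err * xj for g, xj in zip(gw, xi)]
            gb += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, gw)]
        b -= lr * gb / len(X)
    return w, b


def pu_fit(positives, unlabeled, rounds=3, batch=10, seed=0):
    # Start from a random batch of unlabeled points treated as negatives.
    rng = random.Random(seed)
    order = list(range(len(unlabeled)))
    rng.shuffle(order)
    neg_idx = set(order[:batch])

    def fit():
        negs = [unlabeled[i] for i in sorted(neg_idx)]
        X = positives + negs
        y = [1] * len(positives) + [0] * len(negs)
        return fit_weighted_logreg(X, y)

    for _ in range(rounds):
        w, b = fit()
        # Negative selection step (assumed criterion): add the unlabeled
        # points the current model is least sure about.
        rest = [i for i in range(len(unlabeled)) if i not in neg_idx]
        rest.sort(key=lambda i: abs(predict_proba(w, b, unlabeled[i]) - 0.5))
        neg_idx.update(rest[:batch])

    w, b = fit()  # final model trained on all selected negatives
    return w, b, neg_idx
```

Calling `pu_fit(positives, unlabeled)` with two lists of feature vectors returns the weights, the bias, and the indices of the unlabeled points used as negatives. One caveat of this particular criterion is that unlabeled points which are in fact positive can score near the boundary and be selected as negatives, so a different selection rule may be preferable in practice.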
Taxonomy
Topics: Machine Learning and Data Classification · Imbalanced Data Classification Techniques · Machine Learning and Algorithms
