Binary Classification with Positive Labeling Sources
Jieyu Zhang, Yujing Wang, Yaming Yang, Yang Luo, Alexander Ratner

TL;DR
This paper introduces WEAPO, a weak supervision method for binary classification tasks where only positive labeling sources are available, and reports the best average results on 10 benchmark datasets.
Contribution
The paper proposes WEAPO, a weak supervision approach that generates training labels using only positive labeling sources. It targets tasks with a minority positive class, where negative examples can be too diverse for developers to write indicative labeling sources.
Findings
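On 10 benchmark datasets, WEAPO achieves the highest average performance among the compared weak supervision methods.
The gains are reported for both the quality of the generated labels and the performance of the classifier trained on them.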
The method is integrated into the WRENCH benchmarking platform.
Abstract
To create a large amount of training labels for machine learning models effectively and efficiently, researchers have turned to Weak Supervision (WS), which uses programmatic labeling sources rather than manual annotation. Existing works of WS for binary classification typically assume the presence of labeling sources that are able to assign both positive and negative labels to data in roughly balanced proportions. However, for many tasks of interest where there is a minority positive class, negative examples could be too diverse for developers to generate indicative labeling sources. Thus, in this work, we study the application of WS on binary classification tasks with positive labeling sources only. We propose WEAPO, a simple yet competitive WS method for producing training labels without negative labeling sources. On 10 benchmark datasets, we show WEAPO achieves the highest averaged…
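The positive-only setting described in the abstract can be illustrated with a small sketch. This is not WEAPO itself, whose details are not given here; it is a naive vote-fraction baseline, shown only to make the input format concrete. The encoding (`1` for a positive vote, `-1` for abstain), the function names, and the threshold are all assumptions made for illustration.

```python
# Illustrative only: a naive baseline for positive-only labeling sources.
# This is NOT the WEAPO method; encoding and names are assumptions.

ABSTAIN = -1   # assumed encoding: the source did not fire on this example
POSITIVE = 1   # assumed encoding: the source voted "positive"


def positive_vote_scores(label_matrix):
    """Return, per example, the fraction of sources that voted positive."""
    scores = []
    for votes in label_matrix:
        if not votes:
            # No sources at all: treat the example as having no positive evidence.
            scores.append(0.0)
            continue
        fired = sum(1 for v in votes if v == POSITIVE)
        scores.append(fired / len(votes))
    return scores


def threshold_labels(scores, threshold=0.5):
    """Turn scores into hard labels; examples below the threshold become negative."""
    return [1 if s >= threshold else 0 for s in scores]


# Toy label matrix: 4 examples (rows) x 3 positive-only sources (columns).
# No source ever emits a negative label; it either votes positive or abstains.
label_matrix = [
    [POSITIVE, POSITIVE, ABSTAIN],
    [ABSTAIN, ABSTAIN, ABSTAIN],
    [POSITIVE, ABSTAIN, ABSTAIN],
    [POSITIVE, POSITIVE, POSITIVE],
]

scores = positive_vote_scores(label_matrix)
labels = threshold_labels(scores, threshold=0.5)
```

The point of the sketch is the asymmetry: an abstain carries no direct negative evidence, so any negative label has to be inferred from the absence of positive votes. A fixed threshold like the one above is a crude way to do that.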
Taxonomy
Topics: Machine Learning and Data Classification · Water Systems and Optimization · Anomaly Detection Techniques and Applications
