A Survey on Programmatic Weak Supervision
Jieyu Zhang, Cheng-Yu Hsieh, Yue Yu, Chao Zhang, Alexander Ratner

TL;DR
This survey comprehensively reviews recent advances in programmatic weak supervision, highlighting its role in reducing manual labeling efforts and discussing related paradigms and future challenges.
Contribution
The survey provides a structured overview of the components of the programmatic weak supervision (PWS) workflow, compares PWS with related paradigms, and identifies open challenges for future research.
Findings
PWS effectively synthesizes training labels from noisy sources.
Recent approaches improve label quality and learning efficiency.
Several challenges remain, including handling noise and scalability.
Abstract
Labeling training data has become one of the major roadblocks to using machine learning. Among various weak supervision paradigms, programmatic weak supervision (PWS) has achieved remarkable success in easing the manual labeling bottleneck by programmatically synthesizing training labels from multiple potentially noisy supervision sources. This paper presents a comprehensive survey of recent advances in PWS. In particular, we give a brief introduction to the PWS learning paradigm and review representative approaches for each component within PWS's learning workflow. In addition, we discuss complementary learning paradigms for tackling limited labeled data scenarios and how these related approaches can be used in conjunction with PWS. Finally, we identify several critical challenges that remain under-explored in the area, in the hope of inspiring future research directions in the field.
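To make the idea of "programmatically synthesizing training labels from multiple potentially noisy supervision sources" concrete, here is a minimal sketch in Python. It is an illustration only, not a method taken from the survey: the spam/ham task, the three labeling functions, and the names `lf_contains_link`, `lf_mentions_prize`, `lf_short_greeting`, and `majority_vote` are all hypothetical, and majority voting stands in for the more sophisticated label models that PWS systems typically use to aggregate noisy votes.

```python
from collections import Counter

# Hypothetical label space for a toy spam-detection task.
ABSTAIN = -1
HAM = 0
SPAM = 1


# Each labeling function is a cheap, noisy heuristic that either
# votes for a class or abstains. All three are illustrative assumptions.
def lf_contains_link(text):
    return SPAM if "http" in text.lower() else ABSTAIN


def lf_mentions_prize(text):
    return SPAM if "prize" in text.lower() else ABSTAIN


def lf_short_greeting(text):
    return HAM if len(text.split()) <= 3 else ABSTAIN


LABELING_FUNCTIONS = [lf_contains_link, lf_mentions_prize, lf_short_greeting]


def majority_vote(text, lfs=LABELING_FUNCTIONS):
    """Aggregate labeling-function votes into one training label.

    Returns ABSTAIN when no function votes or when the top two
    classes are tied, so ambiguous examples can be filtered out
    before training a downstream model.
    """
    votes = [v for v in (lf(text) for lf in lfs) if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    counts = Counter(votes).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return ABSTAIN
    return counts[0][0]


if __name__ == "__main__":
    for message in [
        "Win a prize at http://example.com",
        "hi there",
        "Claim prize now",
    ]:
        print(repr(message), "->", majority_vote(message))
```

In this sketch the synthesized labels would then be used to train an ordinary classifier; examples on which the functions abstain or disagree evenly are simply left unlabeled.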
Taxonomy
Topics: Machine Learning and Data Classification · Machine Learning and Algorithms · Image Retrieval and Classification Techniques
