A Survey on Programmatic Weak Supervision

Jieyu Zhang; Cheng-Yu Hsieh; Yue Yu; Chao Zhang; Alexander Ratner

arXiv:2202.05433 · cs.LG · February 15, 2022 · 40 cites

Open Access · 1 Repo · 1 Dataset

TL;DR

This survey comprehensively reviews recent advances in programmatic weak supervision, highlighting its role in reducing manual labeling efforts and discussing related paradigms and future challenges.

Contribution

It provides a structured overview of PWS components, compares it with related paradigms, and identifies open challenges for future research.

Findings

01. PWS effectively synthesizes training labels from noisy sources.

02. Recent approaches improve label quality and learning efficiency.

03. Several challenges remain, including handling noise and scalability.

Abstract

Labeling training data has become one of the major roadblocks to using machine learning. Among various weak supervision paradigms, programmatic weak supervision (PWS) has achieved remarkable success in easing the manual labeling bottleneck by programmatically synthesizing training labels from multiple potentially noisy supervision sources. This paper presents a comprehensive survey of recent advances in PWS. In particular, we give a brief introduction to the PWS learning paradigm and review representative approaches for each component within PWS's learning workflow. In addition, we discuss complementary learning paradigms for tackling limited labeled data scenarios and how these related approaches can be used in conjunction with PWS. Finally, we identify several critical challenges that remain under-explored in the area, in the hope of inspiring future research directions in the field.
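
To make "programmatically synthesizing training labels from multiple potentially noisy supervision sources" concrete, the sketch below applies a few heuristic labeling functions to unlabeled text and aggregates their votes. Everything in it is an illustrative assumption rather than a method taken from the survey: the spam/ham task, the three functions (`lf_contains_link`, `lf_mentions_prize`, `lf_short_greeting`), and the use of plain majority voting as the aggregation step, which is only the simplest possible baseline.

```python
from collections import Counter

# Hypothetical label values for an illustrative spam-detection task.
ABSTAIN = -1
HAM = 0
SPAM = 1


def lf_contains_link(text):
    """Hypothetical heuristic: messages with a URL are likely spam."""
    return SPAM if "http" in text.lower() else ABSTAIN


def lf_mentions_prize(text):
    """Hypothetical heuristic: 'prize' or 'free' suggests spam."""
    lowered = text.lower()
    return SPAM if "prize" in lowered or "free" in lowered else ABSTAIN


def lf_short_greeting(text):
    """Hypothetical heuristic: a very short message containing 'hi' is likely ham."""
    words = text.lower().split()
    return HAM if len(words) <= 4 and "hi" in words else ABSTAIN


LABELING_FUNCTIONS = [lf_contains_link, lf_mentions_prize, lf_short_greeting]


def apply_lfs(texts, lfs):
    """Build the label matrix: one row per example, one column per source."""
    return [[lf(text) for lf in lfs] for text in texts]


def majority_vote(votes):
    """Aggregate one row of votes; abstain when no source fires or on a tie."""
    counts = Counter(v for v in votes if v != ABSTAIN)
    if not counts:
        return ABSTAIN
    ranked = counts.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN
    return ranked[0][0]


if __name__ == "__main__":
    unlabeled = [
        "Win a free prize at http://example.com",
        "hi there",
        "Meeting moved to 3pm",
        "free lunch hi",
    ]
    label_matrix = apply_lfs(unlabeled, LABELING_FUNCTIONS)
    for text, votes in zip(unlabeled, label_matrix):
        print(text, votes, majority_vote(votes))
```

Examples that receive a non-abstain label could then serve as training data for a downstream classifier. Majority voting treats every source as equally reliable, so it is best read as a baseline for the aggregation step, not as a representative of the approaches the survey reviews.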

Code & Models

Repositories

JieyuZ2/Awesome-Weak-Supervision
none · Official

Datasets

jieyuz2/WRENCH
dataset · 435 dl

Taxonomy

Topics: Machine Learning and Data Classification · Machine Learning and Algorithms · Image Retrieval and Classification Techniques