Weak Supervision with Incremental Source Accuracy Estimation

Richard Gresham Correro

arXiv:2205.05302·cs.LG·May 12, 2022

TL;DR

This paper presents an incremental method for estimating the dependency structure and accuracies of weak supervision sources as data arrives, so that probabilistic labels for streaming data can be generated with accuracy comparable to offline methods.

Contribution

The paper introduces an incremental approach to estimating the dependency structure and accuracies of weak supervision sources, suited to labeling data in real time.

Findings

01. Generates probabilistic labels with accuracy matching that of existing offline methods.

02. Works with both classification models and heuristic functions as supervision sources.

03. Iteratively updates the estimated source accuracies as new data arrives.

Abstract

Motivated by the desire to generate labels for real-time data, we develop a method to estimate the dependency structure and accuracy of weak supervision sources incrementally. Our method first estimates the dependency structure associated with the supervision sources and then uses this structure to iteratively update the estimated source accuracies as new data is received. Using both heuristic functions and off-the-shelf classification models trained on publicly available datasets as supervision sources, we show that our method generates probabilistic labels with an accuracy matching that of existing off-line methods.
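
To make the idea of updating source accuracies as data arrives more concrete, here is a minimal sketch. It is not the paper's algorithm: it swaps in a simple triplet-style moment estimate and skips dependency-structure estimation entirely. It assumes votes and labels in {-1, +1}, sources that are conditionally independent given the true label, accuracies that do not depend on the class, and sources that are better than chance. The names `IncrementalAccuracyEstimator` and `simulate_votes` are hypothetical and exist only for this illustration.

```python
import numpy as np


class IncrementalAccuracyEstimator:
    """Running accuracy estimate for binary sources (illustrative sketch only).

    Assumptions, none taken from the paper: votes are in {-1, +1}, sources are
    conditionally independent given the true label, each source's accuracy is
    the same for both classes, and every source beats chance.
    """

    def __init__(self, n_sources=3):
        if n_sources < 3:
            raise ValueError("the triplet estimate needs at least three sources")
        self.n = 0  # number of examples seen so far
        self.second_moment = np.zeros((n_sources, n_sources))

    def update(self, votes):
        """Fold a new batch of votes, shape (batch, n_sources), into the running mean."""
        votes = np.asarray(votes, dtype=float)
        batch = votes.shape[0]
        batch_moment = votes.T @ votes / batch
        self.n += batch
        # Standard running-mean update; no past votes need to be stored.
        self.second_moment += (batch_moment - self.second_moment) * (batch / self.n)

    def accuracies(self):
        """Return the estimated P(vote == label) for each source."""
        m = self.second_moment
        k = m.shape[0]
        acc = np.empty(k)
        for i in range(k):
            j, l = (i + 1) % k, (i + 2) % k
            # Under the assumptions, E[v_i v_j] = a_i * a_j with a_i = E[v_i * y],
            # so a_i can be recovered from three pairwise moments.
            a_i = np.sqrt(abs(m[i, j] * m[i, l] / m[j, l]))
            acc[i] = (1.0 + a_i) / 2.0
        return acc


def simulate_votes(rng, n, accuracies):
    """Draw hidden labels and noisy votes; used only to exercise the sketch."""
    y = rng.choice([-1, 1], size=n)
    correct = rng.random((n, len(accuracies))) < np.asarray(accuracies)
    return np.where(correct, y[:, None], -y[:, None])
```

Calling `update` once per incoming batch and `accuracies` whenever labels are needed mimics the streaming setting: the estimator keeps only a count and a small matrix of pairwise vote moments, so memory use does not grow with the stream.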

Code & Models

Repositories

rcorrero/on-line-weak-supervision (PyTorch, official)

Taxonomy

Topics: Machine Learning and Data Classification · Anomaly Detection Techniques and Applications · Topic Modeling