End-to-End Weak Supervision

Salva Rühling Cachay; Benedikt Boecking; Artur Dubrawski

arXiv:2107.02233·cs.LG·December 1, 2021

TL;DR

This paper introduces an end-to-end method for weak supervision that directly optimizes the downstream model, leading to better performance and robustness compared to traditional two-step approaches.

Contribution

It proposes a novel end-to-end framework that integrates weak source modeling with downstream model training, bypassing the limitations of separate probabilistic modeling.

Findings

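01 Improved end-model performance on downstream test sets compared with prior work.

02 Enhanced robustness to dependencies among the weak supervision sources.

03 Outperforms prior weak supervision methods that train in two separate steps.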

Abstract

Aggregating multiple sources of weak supervision (WS) can ease the data-labeling bottleneck prevalent in many machine learning applications by replacing the tedious manual collection of ground-truth labels. Current state-of-the-art approaches that do not use any labeled training data, however, require two separate modeling steps: learning a probabilistic latent variable model based on the WS sources (making assumptions that rarely hold in practice), followed by downstream model training. Importantly, the first modeling step does not consider the performance of the downstream model. To address these caveats, we propose an end-to-end approach for directly learning the downstream model by maximizing its agreement with probabilistic labels generated by reparameterizing previous probabilistic posteriors with a neural network. Our results show improved performance over prior work in…
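
The idea of training the downstream model to agree with probabilistic labels can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation (the official PyTorch code is the repository listed on this page). The function names `aggregate_votes` and `agreement_loss`, the `temperature` parameter, the softmax form of the label, and the symmetric cross-entropy used as the agreement measure are all assumptions made for illustration. The per-source weights are passed in as plain numbers here; in an end-to-end setup they would be produced by a neural network and trained jointly with the downstream model.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_votes(votes, weights, num_classes, temperature=1.0):
    """Turn weak-source votes into a probabilistic label (illustrative only).

    votes[j]   : class index voted by source j, or -1 if the source abstains
    weights[j] : hypothetical per-source weight; in an end-to-end setup this
                 would be the output of a trainable network, not a constant
    """
    scores = [0.0] * num_classes
    for vote, weight in zip(votes, weights):
        if vote >= 0:  # abstaining sources contribute nothing
            scores[vote] += temperature * weight
    return softmax(scores)


def cross_entropy(target, predicted, eps=1e-12):
    """Cross-entropy of `predicted` measured against the `target` distribution."""
    return -sum(t * math.log(p + eps) for t, p in zip(target, predicted))


def agreement_loss(model_probs, label_probs):
    """Assumed symmetric agreement loss: low when both distributions match.

    In an autodiff framework, each target would typically be held fixed
    (detached) so that each side is only pulled toward the other.
    """
    return cross_entropy(label_probs, model_probs) + cross_entropy(model_probs, label_probs)


# Toy example: four sources, binary task, the last source abstains.
votes = [1, 1, 0, -1]
weights = [2.0, 1.0, 0.5, 3.0]
label = aggregate_votes(votes, weights, num_classes=2)

agreeing_model = [0.1, 0.9]     # downstream prediction close to the label
disagreeing_model = [0.9, 0.1]  # downstream prediction far from the label
print(label)
print(agreement_loss(agreeing_model, label))
print(agreement_loss(disagreeing_model, label))
```

The loss is smaller for the prediction that agrees with the aggregated label, which is the signal a joint training loop would minimize. A two-step pipeline would instead fix the label first and only then fit the downstream model.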


Code & Models

Repositories

autonlab/weasel
PyTorch · Official

Videos

End-to-End Weak Supervision · SlidesLive

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Machine Learning and Data Classification · Adversarial Robustness in Machine Learning