Label Augmentation with Reinforced Labeling for Weak Supervision

Gürkan Solmaz; Flavio Cirillo; Fabio Maresca; Anagha Gode Anil Kumar

arXiv:2204.06436 · cs.LG · April 14, 2022

TL;DR

This paper introduces reinforced labeling, a method that enhances weak supervision by augmenting labeling functions with similarity-based inference, significantly improving classifier performance across multiple domains.

Contribution

It proposes a novel reinforced labeling approach that leverages data features and similarities to increase labeling coverage in weak supervision, addressing limitations of existing data programming methods.

Findings

01. Up to +21 points in accuracy

02. Up to +61 points in F1 scores

03. Effective across diverse domains

Abstract

Weak supervision (WS) is an alternative to traditional supervised learning that addresses the need for ground truth. Data programming is a practical WS approach that allows programmatic labeling of data samples using labeling functions (LFs) instead of hand-labeling each data point. However, the existing approach fails to fully exploit the domain knowledge encoded into LFs, especially when the LFs' coverage is low. This is due to the common data programming pipeline, which neglects to utilize data features during the generative process. This paper proposes a new approach called reinforced labeling (RL). Given an unlabeled dataset and a set of LFs, RL augments the LFs' outputs to cases not covered by LFs based on similarities among samples. Thus, RL can lead to higher labeling coverage for training an end classifier. The experiments on several domains (classification of YouTube comments, wine…
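
To make the idea of augmenting LF outputs by sample similarity concrete, here is a minimal Python sketch. It is not the paper's algorithm: the abstract does not specify the similarity measure or the augmentation rule, so the cosine similarity, the nearest-covered-sample rule, the `threshold` parameter, the `ABSTAIN = -1` convention, and the two toy labeling functions are all assumptions made for illustration.

```python
import math

ABSTAIN = -1  # assumed convention: an LF returns -1 when it does not apply


def cosine(u, v):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def apply_lfs(samples, lfs):
    """Build the label matrix: one row per sample, one column per LF."""
    return [[lf(x) for lf in lfs] for x in samples]


def augment(label_matrix, features, threshold=0.95):
    """For each LF column, copy the LF's output from the most similar
    sample it labeled to each sample it abstained on, if the similarity
    reaches the threshold (hypothetical rule, not taken from the paper)."""
    augmented = [row[:] for row in label_matrix]
    for j in range(len(label_matrix[0])):
        covered = [i for i, row in enumerate(label_matrix) if row[j] != ABSTAIN]
        if not covered:
            continue
        for i, row in enumerate(label_matrix):
            if row[j] != ABSTAIN:
                continue
            best = max(covered, key=lambda k: cosine(features[i], features[k]))
            if cosine(features[i], features[best]) >= threshold:
                augmented[i][j] = label_matrix[best][j]
    return augmented


def coverage(label_matrix):
    """Fraction of samples that received a label from at least one LF."""
    labeled = sum(any(v != ABSTAIN for v in row) for row in label_matrix)
    return labeled / len(label_matrix)


# Toy data: each sample is its own 2-D feature vector.
features = [(1.0, 0.0), (0.8, 0.1), (0.0, 1.0), (0.1, 0.8), (0.5, 0.5)]

# Two narrow, hypothetical LFs with low coverage.
lfs = [
    lambda x: 1 if x[0] > 0.9 else ABSTAIN,
    lambda x: 0 if x[1] > 0.9 else ABSTAIN,
]

L = apply_lfs(features, lfs)
L_aug = augment(L, features, threshold=0.95)
print(coverage(L), coverage(L_aug))
```

In this toy run the two LFs label only two of the five samples directly. The augmentation step extends each LF's output to one close neighbor, and the ambiguous sample `(0.5, 0.5)` stays unlabeled because it is not similar enough to any covered sample. A real pipeline would then pass the augmented label matrix to a label model or end classifier.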

Taxonomy

Topics: Machine Learning and Data Classification · Generative Adversarial Networks and Image Synthesis · Video Analysis and Summarization