Label Augmentation with Reinforced Labeling for Weak Supervision
Gürkan Solmaz, Flavio Cirillo, Fabio Maresca, Anagha Gode Anil Kumar

TL;DR
This paper introduces reinforced labeling, a method that enhances weak supervision by augmenting labeling functions with similarity-based inference, significantly improving classifier performance across multiple domains.
Contribution
It proposes a novel reinforced labeling approach that leverages data features and similarities to increase labeling coverage in weak supervision, addressing limitations of existing data programming methods.
Findings
Up to +21 points in accuracy
Up to +61 points in F1 scores
Effective across diverse domains
Abstract
Weak supervision (WS) is an alternative to traditional supervised learning that addresses the need for ground truth. Data programming is a practical WS approach that allows programmatic labeling of data samples using labeling functions (LFs) instead of hand-labeling each data point. However, the existing approach fails to fully exploit the domain knowledge encoded into LFs, especially when the LFs' coverage is low. This is because the common data programming pipeline does not use data features during the generative process. This paper proposes a new approach called reinforced labeling (RL). Given an unlabeled dataset and a set of LFs, RL extends the LFs' outputs to cases not covered by the LFs, based on similarities among samples. Thus, RL can lead to higher labeling coverage for training an end classifier. The experiments on several domains (classification of YouTube comments, wine…
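To make the idea of extending LF outputs to uncovered samples concrete, here is a minimal sketch. It is not the authors' algorithm: it assumes a simple nearest-neighbor rule with cosine similarity and a fixed threshold, and the names `ABSTAIN`, `coverage`, `augment_labels`, and `threshold` are hypothetical, chosen only for illustration. It shows how copying an LF's vote from a covered sample to a sufficiently similar uncovered one raises labeling coverage.

```python
import numpy as np

ABSTAIN = -1  # assumed convention: an LF outputs -1 when it does not fire


def coverage(L):
    """Fraction of samples that receive at least one non-abstain LF vote."""
    L = np.asarray(L)
    return float(np.mean((L != ABSTAIN).any(axis=1)))


def augment_labels(X, L, threshold=0.9):
    """Copy LF votes from covered samples to similar uncovered samples.

    X: (n, d) feature matrix. L: (n, m) matrix of LF outputs.
    For each LF j and each sample i on which LF j abstains, take the vote
    of the most cosine-similar sample on which LF j fired, but only if
    that similarity is at least `threshold` (an illustrative assumption).
    """
    X = np.asarray(X, dtype=float)
    L = np.asarray(L)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / np.where(norms == 0, 1.0, norms)  # avoid division by zero
    sim = Xn @ Xn.T                            # pairwise cosine similarity
    L_aug = L.copy()
    for j in range(L.shape[1]):
        fired = np.flatnonzero(L[:, j] != ABSTAIN)
        if fired.size == 0:
            continue  # nothing to propagate for this LF
        for i in np.flatnonzero(L[:, j] == ABSTAIN):
            k = fired[np.argmax(sim[i, fired])]  # nearest covered sample
            if sim[i, k] >= threshold:
                L_aug[i, j] = L[k, j]
    return L_aug


# Toy data: one LF that fires on samples 0 and 2 only.
X = np.array([[1.0, 0.0], [0.99, 0.1], [0.0, 1.0], [0.1, 0.99], [-1.0, -1.0]])
L = np.array([[1], [ABSTAIN], [0], [ABSTAIN], [ABSTAIN]])
L_aug = augment_labels(X, L, threshold=0.9)
print(coverage(L), coverage(L_aug))
```

In this toy case, samples 1 and 3 inherit the votes of their near-duplicates, while the dissimilar sample 4 stays unlabeled. The threshold controls the trade-off: lowering it increases coverage but risks propagating votes to samples the LF was never meant to cover.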
Taxonomy
Topics: Machine Learning and Data Classification · Generative Adversarial Networks and Image Synthesis · Video Analysis and Summarization
