SepLL: Separating Latent Class Labels from Weak Supervision Noise
Andreas Stephan, Vasiliki Kougia, Benjamin Roth

TL;DR
SepLL is a deep learning approach to weak supervision that separates task-related information from labeling-function-specific information in the latent space, so the weak labels can be used without pre-processing.
Contribution
The paper introduces SepLL, an end-to-end transformer-based model that disentangles labeling-function-specific noise from the target label signal in the latent space.
Findings
Achieves state-of-the-art performance on the Wrench text classification tasks.
Effectively separates label noise from true signal in latent space.
No pre-processing or label correction needed.
Abstract
In the weakly supervised learning paradigm, labeling functions automatically assign heuristic, often noisy, labels to data samples. In this work, we provide a method for learning from weak labels by separating two types of complementary information associated with the labeling functions: information related to the target label and information specific to one labeling function only. Both types of information are reflected to different degrees by all labeled instances. In contrast to previous works that aimed at correcting or removing wrongly labeled instances, we learn a branched deep model that uses all data as-is, but splits the labeling function information in the latent space. Specifically, we propose the end-to-end model SepLL which extends a transformer classifier by introducing a latent space for labeling function specific and task-specific information. The learning signal is only…
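The branched design described in the abstract can be sketched roughly as follows. This is a minimal illustration in plain Python, not the authors' implementation: the class `BranchedHead`, the helper `lf_match_loss`, the additive way the two branches are combined, and the use of labeling function matches as the training target are all assumptions made for this sketch (the abstract is cut off before it specifies the learning signal). The vector `h` stands in for the output of a transformer encoder.

```python
import math
import random


def linear(x, weights, bias):
    """Dense layer on plain lists: returns weights @ x + bias."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def init_matrix(rows, cols, rng):
    """Small random weights; a real model would learn these."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class BranchedHead:
    """Hypothetical two-branch head on top of an encoder vector.

    One branch scores the target classes (task-specific information),
    the other scores each labeling function (LF-specific information).
    """

    def __init__(self, hidden, num_classes, lf_to_class, seed=0):
        rng = random.Random(seed)
        # lf_to_class[i] is the class index that labeling function i votes for.
        self.lf_to_class = lf_to_class
        num_lfs = len(lf_to_class)
        self.w_task = init_matrix(num_classes, hidden, rng)
        self.b_task = [0.0] * num_classes
        self.w_lf = init_matrix(num_lfs, hidden, rng)
        self.b_lf = [0.0] * num_lfs

    def forward(self, h):
        task_logits = linear(h, self.w_task, self.b_task)
        lf_logits = linear(h, self.w_lf, self.b_lf)
        # Assumption: each LF's score is its own logit plus the logit of
        # the class it votes for; the result is normalized over all LFs.
        combined = [lf_logits[i] + task_logits[c]
                    for i, c in enumerate(self.lf_to_class)]
        return task_logits, softmax(combined)


def lf_match_loss(lf_probs, matches):
    """Cross-entropy against the normalized LF match vector (assumed target)."""
    total = sum(matches)
    if total == 0:
        return 0.0  # no labeling function fired on this sample
    return -sum((m / total) * math.log(p)
                for m, p in zip(matches, lf_probs) if m)


# Three hypothetical labeling functions: two vote for class 0, one for class 1.
head = BranchedHead(hidden=4, num_classes=2, lf_to_class=[0, 0, 1])
h = [0.5, -1.0, 0.25, 2.0]  # stand-in for a transformer encoder output
task_logits, lf_probs = head.forward(h)
loss = lf_match_loss(lf_probs, matches=[1, 0, 1])
predicted_class = max(range(len(task_logits)), key=task_logits.__getitem__)
```

In this sketch, only the task branch is read out at prediction time (`predicted_class`), while the LF branch exists to absorb labeling-function-specific patterns during training. All weakly labeled samples are used as-is; nothing is filtered or relabeled beforehand.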
Taxonomy
Topics: Machine Learning and Data Classification · Text and Document Classification Technologies · Human Pose and Action Recognition
