ULF: Unsupervised Labeling Function Correction using Cross-Validation for Weak Supervision
Anastasiia Sedova, Benjamin Roth

TL;DR
ULF is an unsupervised method that denoises weakly supervised data: it applies k-fold cross-validation over the labeling functions to correct how they are assigned to classes, improving the automatic annotations without manual labeling.
Contribution
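The paper introduces ULF (Unsupervised Labeling Function correction), an algorithm that trains models on all but a held-out subset of labeling functions and uses them to identify and correct biases specific to the held-out ones. The assignment of labeling functions to classes is re-estimated on highly reliable cross-validated samples, so no manual labeling is required.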
Findings
Evaluation on multiple datasets confirms that ULF improves learning from weak supervision.
ULF identifies and corrects biases specific to individual labeling functions by refining their assignment to classes.
The improvement is obtained without any manually labeled data.
Abstract
A cost-effective alternative to manual data labeling is weak supervision (WS), where data samples are automatically annotated using a predefined set of labeling functions (LFs), rule-based mechanisms that generate artificial labels for the associated classes. In this work, we investigate noise reduction techniques for WS based on the principle of k-fold cross-validation. We introduce a new algorithm ULF for Unsupervised Labeling Function correction, which denoises WS data by leveraging models trained on all but some LFs to identify and correct biases specific to the held-out LFs. Specifically, ULF refines the allocation of LFs to classes by re-estimating this assignment on highly reliable cross-validated samples. Evaluation on multiple datasets confirms ULF's effectiveness in enhancing WS learning without the need for manual labeling.
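The cross-validation idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names (`fit_predict_proba`, `ulf_style_update`), the nearest-centroid stand-in classifier, the confidence `threshold`, and the `blend` weight are all assumptions made for illustration. `Z` is a binary sample-by-LF match matrix and `T` is the LF-to-class assignment matrix. The labeling functions are split into folds. For each fold, a model is trained only on samples that match none of the held-out LFs, and its confident predictions on the held-out samples are used to re-estimate the class assignment of each held-out LF.

```python
import numpy as np


def fit_predict_proba(X_train, y_train, X_test, n_classes):
    """Stand-in classifier (assumption): softmax over negative centroid distances."""
    dists = np.full((len(X_test), n_classes), np.inf)
    for c in range(n_classes):
        members = X_train[y_train == c]
        if len(members):
            dists[:, c] = np.linalg.norm(X_test - members.mean(axis=0), axis=1)
    logits = -dists
    logits -= logits.max(axis=1, keepdims=True)
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)


def ulf_style_update(X, Z, T, n_folds=2, threshold=0.6, blend=0.5, seed=0):
    """Re-estimate the LF-to-class matrix T by cross-validating over LFs.

    X: (n_samples, n_features) features
    Z: (n_samples, n_lfs) binary matrix, Z[i, j] = 1 if LF j matches sample i
    T: (n_lfs, n_classes) current LF-to-class assignment (rows sum to 1)
    """
    rng = np.random.default_rng(seed)
    n_lfs, n_classes = T.shape
    folds = np.array_split(rng.permutation(n_lfs), n_folds)
    counts = np.zeros((n_lfs, n_classes))

    for held_out in folds:
        hits_held_out = Z[:, held_out].sum(axis=1) > 0
        # Train only on labeled samples that no held-out LF matches.
        train = ~hits_held_out & (Z.sum(axis=1) > 0)
        if not train.any() or not hits_held_out.any():
            continue
        y_train = (Z[train] @ T).argmax(axis=1)  # majority vote of remaining LFs
        probs = fit_predict_proba(X[train], y_train, X[hits_held_out], n_classes)
        confident = probs.max(axis=1) >= threshold
        pred = probs.argmax(axis=1)
        Z_test = Z[hits_held_out]
        for lf in held_out:
            # Count confident out-of-sample predictions per class for this LF.
            mask = (Z_test[:, lf] > 0) & confident
            counts[lf] += np.bincount(pred[mask], minlength=n_classes)

    row_sums = counts.sum(axis=1, keepdims=True)
    # LFs without confident evidence keep their current assignment.
    T_new = np.where(row_sums > 0, counts / np.maximum(row_sums, 1), T)
    return blend * T + (1 - blend) * T_new


# Toy data: two well-separated classes and five LFs; LF 4 matches class-1
# samples but is wrongly assigned to class 0 in T.
y_true = np.repeat([0, 1], 20)
X = np.column_stack([np.where(y_true == 0, -2.0, 2.0), np.linspace(-0.5, 0.5, 40)])
Z = np.zeros((40, 5))
Z[0:20:2, 0] = 1
Z[1:20:2, 1] = 1
Z[20:34, 2] = 1
Z[26:40, 3] = 1
Z[26:34, 4] = 1
T = np.array([[1, 0], [1, 0], [0, 1], [0, 1], [1, 0]], dtype=float)

T_updated = ulf_style_update(X, Z, T, n_folds=5, blend=0.25)
print(T_updated.round(2))
```

In this toy setup each LF forms its own fold, so the model that judges an LF never saw labels derived from it. Blending the old and re-estimated matrices, rather than replacing `T` outright, is one simple way to limit the effect of a single noisy round. The sketch makes no claim about the update rule used in the paper.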
Taxonomy
Topics: Machine Learning and Data Classification · Rough Sets and Fuzzy Logic · Music and Audio Processing
