AutoWS: Automated Weak Supervision Framework for Text Classification
Abhinav Bohra, Huy Nguyen, Devashish Khatwani

TL;DR
AutoWS is an automated framework that enhances weak supervision for text classification by automatically generating labeling functions, reducing reliance on domain experts, and improving labeling efficiency and model performance.
Contribution
AutoWS introduces a fully automatic, hyper-parameter-free framework that generates labeling functions from minimal labeled data, advancing weak supervision techniques for text classification.
Findings
Outperforms state-of-the-art weak supervision methods
Requires fewer labeled examples to generate effective labels
Achieves higher accuracy in text classification tasks
Abstract
Creating large, high-quality labeled datasets has become one of the major bottlenecks in developing machine learning applications. Multiple techniques have been developed either to decrease the dependence on labeled data (zero/few-shot learning, weak supervision) or to improve the efficiency of the labeling process (active learning). Among those, weak supervision has been shown to reduce labeling costs by employing hand-crafted labeling functions designed by domain experts. We propose AutoWS, a novel framework for increasing the efficiency of the weak supervision process while decreasing the dependency on domain experts. Our method requires a small set of labeled examples per label class and automatically creates a set of labeling functions to assign noisy labels to a large pool of unlabeled data. Noisy labels can then be aggregated into probabilistic labels used by a downstream discriminative…
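The pipeline the abstract describes (a few labeled examples per class, automatically generated labeling functions, noisy votes, probabilistic labels) can be illustrated with a toy sketch. This is not the AutoWS algorithm: the keyword-based labeling functions, the `top_k` cutoff, the normalized vote-count aggregation, and all function names below are assumptions chosen only to make the data flow concrete.

```python
from collections import Counter, defaultdict

ABSTAIN = None  # a labeling function may decline to vote


def make_keyword_lfs(seed_examples, top_k=2):
    """Build (keyword, label) labeling functions from a few labeled examples.

    Hypothetical heuristic: keep words that occur in exactly one class,
    ranked by how many seed documents of that class contain them.
    """
    doc_freq = defaultdict(Counter)  # label -> word -> number of seed docs
    for text, label in seed_examples:
        doc_freq[label].update(set(text.lower().split()))

    lfs = []
    for label, counts in doc_freq.items():
        other_words = set()
        for other_label, other_counts in doc_freq.items():
            if other_label != label:
                other_words.update(other_counts)
        exclusive = [(w, c) for w, c in counts.items() if w not in other_words]
        # Most frequent first; ties broken alphabetically for determinism.
        exclusive.sort(key=lambda wc: (-wc[1], wc[0]))
        lfs.extend((word, label) for word, _ in exclusive[:top_k])
    return lfs


def apply_lfs(lfs, text):
    """Return one noisy vote (a label or ABSTAIN) per labeling function."""
    tokens = set(text.lower().split())
    return [label if word in tokens else ABSTAIN for word, label in lfs]


def aggregate(votes, classes):
    """Turn noisy votes into a probabilistic label via normalized vote counts."""
    counts = Counter(v for v in votes if v is not ABSTAIN)
    total = sum(counts.values())
    if total == 0:  # no function fired: fall back to a uniform distribution
        return {c: 1.0 / len(classes) for c in classes}
    return {c: counts[c] / total for c in classes}


# Made-up seed set: two labeled examples per class.
SEED = [
    ("great movie loved it", "pos"),
    ("loved the great acting", "pos"),
    ("terrible movie hated it", "neg"),
    ("hated the terrible plot", "neg"),
]
CLASSES = ["pos", "neg"]

lfs = make_keyword_lfs(SEED)
for doc in ["i loved this great film", "a great cast but terrible script"]:
    print(doc, "->", aggregate(apply_lfs(lfs, doc), CLASSES))
```

In a full weak supervision setup, the resulting probabilistic labels would serve as training targets for a downstream discriminative classifier, and the simple vote counting here would typically be replaced by a label model that estimates the accuracy of each labeling function.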
Taxonomy
Topics: Machine Learning and Data Classification · Text and Document Classification Technologies · Imbalanced Data Classification Techniques
