Uncertainty-aware Self-training for Text Classification with Few Labels
Subhabrata Mukherjee, Ahmed Hassan Awadallah

TL;DR
This paper introduces an uncertainty-aware self-training method for text classification that combines a small labeled set with unlabeled data and uses Bayesian deep learning techniques to reach accuracy close to that of fully supervised models.
Contribution
The paper proposes a novel uncertainty-aware self-training approach using MC Dropout for better instance selection and confidence-based learning, improving semi-supervised text classification.
Findings
Reaches accuracy within 3% of fully supervised models using only 20-30 labels per class.
Improves accuracy by up to 12% over baseline methods.
Effective on five benchmark datasets with limited labeled data.
Abstract
Recent success of large-scale pre-trained language models crucially hinges on fine-tuning them on large amounts of labeled data for the downstream task, which is typically expensive to acquire. In this work, we study self-training as one of the earliest semi-supervised learning approaches to reduce the annotation bottleneck by making use of large-scale unlabeled data for the target task. The standard self-training mechanism randomly samples instances from the unlabeled pool to pseudo-label and augment labeled data. In this work, we propose an approach to improve self-training by incorporating uncertainty estimates of the underlying neural network leveraging recent advances in Bayesian deep learning. Specifically, we propose (i) acquisition functions to select instances from the unlabeled pool leveraging Monte Carlo (MC) Dropout, and (ii) a learning mechanism leveraging model confidence for…
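The idea of selecting unlabeled instances with MC Dropout can be illustrated with a minimal sketch. Everything below is an assumption for illustration and not the authors' implementation: the toy linear classifier, the helper names (mc_dropout_probs, bald_scores, select_low_uncertainty), the use of a BALD-style mutual-information score as the acquisition function, and the choice to keep the lowest-scoring instances. The abstract does not state which acquisition functions the paper uses.

```python
import numpy as np


def mc_dropout_probs(x, weights, n_passes=20, drop_rate=0.1, rng=None):
    """Run several stochastic forward passes with dropout left on.

    Toy stand-in for a neural classifier (an assumption): a linear layer
    with inverted dropout on the inputs, followed by a softmax.
    Returns an array of shape (n_passes, n_instances, n_classes).
    """
    rng = np.random.default_rng() if rng is None else rng
    outputs = []
    for _ in range(n_passes):
        # Keep each input feature with probability 1 - drop_rate.
        mask = rng.random(x.shape) >= drop_rate
        logits = (x * mask / (1.0 - drop_rate)) @ weights
        # Numerically stable softmax over the class axis.
        logits -= logits.max(axis=1, keepdims=True)
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        outputs.append(probs)
    return np.stack(outputs)


def bald_scores(probs, eps=1e-12):
    """Mutual information between predictions and the dropout masks.

    Computed as the entropy of the mean prediction minus the mean entropy
    of the per-pass predictions. Higher means more model uncertainty.
    """
    mean_probs = probs.mean(axis=0)
    entropy_of_mean = -(mean_probs * np.log(mean_probs + eps)).sum(axis=1)
    mean_entropy = -(probs * np.log(probs + eps)).sum(axis=2).mean(axis=0)
    return entropy_of_mean - mean_entropy


def select_low_uncertainty(probs, k):
    """Pick the k instances with the lowest score and pseudo-label them.

    Hypothetical selection rule: other rules (for example, sampling in
    proportion to the score) fit the same interface.
    """
    scores = bald_scores(probs)
    chosen = np.argsort(scores)[:k]
    pseudo_labels = probs.mean(axis=0)[chosen].argmax(axis=1)
    return chosen, pseudo_labels


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    unlabeled = rng.normal(size=(100, 16))   # synthetic unlabeled pool
    weights = rng.normal(size=(16, 3))       # synthetic 3-class classifier
    probs = mc_dropout_probs(unlabeled, weights, rng=rng)
    chosen, pseudo_labels = select_low_uncertainty(probs, k=10)
    print(chosen, pseudo_labels)
```

In a self-training loop, the selected instances and their pseudo-labels would be added to the labeled set before the next round of fine-tuning. With dropout disabled (drop_rate=0.0) all passes coincide and the score collapses to zero, which is a quick sanity check that the score measures disagreement between passes.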
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis
Methods: Dropout
