Adaptive Self-training for Few-shot Neural Sequence Labeling
Yaqing Wang, Subhabrata Mukherjee, Haoda Chu, Yuancheng Tu, Ming Wu, Jing Gao, Ahmed Hassan Awadallah

TL;DR
This paper introduces adaptive self-training combined with meta-learning to improve neural sequence labeling in low-resource settings, significantly enhancing performance with minimal labeled data.
Contribution
It proposes a novel combination of self-training and meta-learning techniques specifically designed for few-shot neural sequence labeling tasks.
Findings
Achieves a 10% improvement over state-of-the-art systems with only 10 labeled examples per class.
Demonstrates effectiveness across six benchmark datasets, including multilingual NER and dialog slot tagging.
Validates the approach's robustness in low-resource, high-privacy scenarios.
Abstract
Sequence labeling is an important technique employed for many Natural Language Processing (NLP) tasks, such as Named Entity Recognition (NER), slot tagging for dialog systems and semantic parsing. Large-scale pre-trained language models obtain very good performance on these tasks when fine-tuned on large amounts of task-specific labeled data. However, such large-scale labeled datasets are difficult to obtain for several tasks and domains due to the high cost of human annotation as well as privacy and data access constraints for sensitive user applications. This is exacerbated for sequence labeling tasks requiring such annotations at token-level. In this work, we develop techniques to address the label scarcity challenge for neural sequence labeling models. Specifically, we develop self-training and meta-learning techniques for training neural sequence taggers with few labels. While…
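To make the self-training idea concrete, here is a minimal sketch of a generic token-level pseudo-labeling loop. It is not the authors' method: the toy count-based tagger, the previous-token back-off, the `threshold` value, and all function names are assumptions chosen for illustration, and the meta-learning component mentioned in the abstract is not modeled at all. A teacher trained on a few labeled sentences tags unlabeled sentences, only confidently tagged tokens are kept, and a student is retrained on the union.

```python
from collections import Counter, defaultdict

def train(examples):
    """Fit a toy tagger (an assumption, not a neural model).

    examples: list of (tokens, labels); a label of None means "skip this token".
    """
    by_token = defaultdict(Counter)  # label counts keyed by the token itself
    by_prev = defaultdict(Counter)   # label counts keyed by the previous token
    for tokens, labels in examples:
        prev = "<s>"
        for tok, lab in zip(tokens, labels):
            tok = tok.lower()
            if lab is not None:
                by_token[tok][lab] += 1
                by_prev[prev][lab] += 1
            prev = tok
    return by_token, by_prev

def predict(model, tokens):
    """Return a (label, confidence) pair for each token."""
    by_token, by_prev = model
    out, prev = [], "<s>"
    for tok in tokens:
        tok = tok.lower()
        # Use the token's own counts if seen, else back off to the previous token.
        counts = by_token.get(tok) or by_prev.get(prev)
        if counts:
            label, n = counts.most_common(1)[0]
            out.append((label, n / sum(counts.values())))
        else:
            out.append(("O", 0.0))  # nothing known: default label, zero confidence
        prev = tok
    return out

def self_train(labeled, unlabeled, rounds=2, threshold=0.9):
    """Generic self-training: the teacher pseudo-labels, the student retrains."""
    model = train(labeled)  # initial teacher from the few labeled examples
    for _ in range(rounds):
        pseudo = []
        for tokens in unlabeled:
            preds = predict(model, tokens)
            # Mask out low-confidence tokens so they do not affect retraining.
            labels = [lab if conf >= threshold else None for lab, conf in preds]
            pseudo.append((tokens, labels))
        model = train(labeled + pseudo)  # the student becomes the next teacher
    return model
```

With one labeled sentence, "She lives in Paris" tagged O O O LOC, and one unlabeled sentence, "He works in Berlin", the teacher tags "Berlin" as LOC through the previous-token back-off, so the retrained student recognizes "Berlin" on its own. The hard confidence threshold is the simplest possible filter for noisy pseudo-labels; it stands in here for whatever selection or weighting scheme a real system would use.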
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
