Few-Shot Named Entity Recognition: A Comprehensive Study

Jiaxin Huang; Chunyuan Li; Krishan Subudhi; Damien Jose; Shobana Balakrishnan; Weizhu Chen; Baolin Peng; Jianfeng Gao; Jiawei Han

arXiv:2012.14978 · cs.CL · January 1, 2021 · 51 cites

TL;DR

This paper systematically explores methods to enhance few-shot NER performance using Transformer-based models, combining meta-learning, supervised pre-training, and self-training, achieving state-of-the-art results on multiple datasets.

Contribution

It introduces and empirically evaluates a comprehensive set of schemes for few-shot NER, including meta-learning, noisy data pre-training, and self-training, demonstrating their effectiveness.

Findings

1. The proposed schemes outperform the baseline on few-shot NER tasks.

2. The study achieves new state-of-the-art results on multiple datasets.

3. The comparisons provide insights for future research on low-resource NER.

Abstract

This paper presents a comprehensive study of how to efficiently build named entity recognition (NER) systems when only a small amount of in-domain labeled data is available. Building on recent Transformer-based self-supervised pre-trained language models (PLMs), we investigate three orthogonal schemes to improve model generalization in few-shot settings: (1) meta-learning to construct prototypes for different entity types, (2) supervised pre-training on noisy web data to extract entity-related generic representations, and (3) self-training to leverage unlabeled in-domain data. Different combinations of these schemes are also considered. We perform extensive empirical comparisons on 10 public NER datasets with various proportions of labeled data, suggesting useful insights for future research. Our experiments show that (i) in the few-shot learning setting, the proposed NER schemes…
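To make scheme (1) concrete, the following is a minimal sketch of prototype-based token classification, not the authors' implementation. It assumes each entity type's prototype is the mean of its support-token embeddings and that a query token takes the label of the nearest prototype under Euclidean distance. The function names `build_prototypes` and `classify`, the toy 2-d vectors, and the label set are all hypothetical; in practice the embeddings would come from a PLM encoder.

```python
from collections import defaultdict
from math import dist


def build_prototypes(support):
    """Average the embeddings of each label's support tokens.

    support: list of (embedding, label) pairs.
    Returns a dict mapping label -> mean embedding (the prototype).
    """
    grouped = defaultdict(list)
    for vec, label in support:
        grouped[label].append(vec)
    return {
        label: [sum(dim) / len(vecs) for dim in zip(*vecs)]
        for label, vecs in grouped.items()
    }


def classify(token_vec, prototypes):
    """Return the label whose prototype is closest in Euclidean distance."""
    return min(prototypes, key=lambda label: dist(token_vec, prototypes[label]))


# Toy 2-d "token embeddings" standing in for PLM outputs (assumption).
support = [
    ([1.0, 0.0], "PER"),
    ([0.8, 0.2], "PER"),
    ([0.0, 1.0], "LOC"),
    ([0.2, 0.8], "LOC"),
    ([0.0, 0.0], "O"),
]
prototypes = build_prototypes(support)
print(classify([0.9, 0.1], prototypes))
```

Because the prototypes are computed from the support set alone, new entity types can be added at inference time by supplying a few labeled tokens, without retraining a classification head. This is one common motivation for prototype methods in few-shot settings.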

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Domain Adaptation and Few-Shot Learning