Learning from Miscellaneous Other-Class Words for Few-shot Named Entity Recognition
Meihan Tong, Shuai Wang, Bin Xu, Yixin Cao, Minghui Liu, Lei Hou, Juanzi Li

TL;DR
This paper introduces MUCO, a novel approach for few-shot NER that automatically induces undefined classes from other-class words, improving classification accuracy and semantic understanding in low-data scenarios.
Contribution
The paper proposes MUCO, a new model that enhances few-shot NER by leveraging undefined classes from other-class words, addressing overfitting and semantic differentiation issues.
Findings
Outperforms five state-of-the-art models in both 1-shot and 5-shot settings on four NER benchmarks
Improves discriminative ability of NER classifiers
Enhances understanding of predefined classes with extra semantic knowledge
Abstract
Few-shot Named Entity Recognition (NER) exploits only a handful of annotations to identify and classify named entity mentions. Prototypical networks show superior performance on few-shot NER. However, existing prototypical methods fail to differentiate the rich semantics in other-class words, which aggravates overfitting in few-shot scenarios. To address this issue, we propose a novel model, Mining Undefined Classes from Other-class (MUCO), that can automatically induce different undefined classes from the other class to improve few-shot NER. With these extra-labeled undefined classes, our method improves the discriminative ability of the NER classifier and enhances the understanding of predefined classes with stand-by semantic knowledge. Experimental results demonstrate that our model outperforms five state-of-the-art models in both 1-shot and 5-shot settings on four NER benchmarks.…
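The prototype-based classification the abstract refers to can be illustrated with a toy sketch. This is not the MUCO implementation: the 2-D vectors stand in for token embeddings, and the labels `O_a` and `O_b` are hypothetical, hand-assigned sub-classes of the other class, whereas MUCO induces such classes automatically by a procedure this summary does not describe. The sketch only shows why one averaged other-class prototype can be less discriminative than several.

```python
from collections import defaultdict

def build_prototypes(support):
    """Average the support embeddings of each label into one prototype."""
    grouped = defaultdict(list)
    for label, vec in support:
        grouped[label].append(vec)
    return {
        label: [sum(dim) / len(vecs) for dim in zip(*vecs)]
        for label, vecs in grouped.items()
    }

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def classify(query, prototypes):
    """Return the label whose prototype is nearest to the query."""
    return min(prototypes, key=lambda lbl: squared_distance(query, prototypes[lbl]))

# Toy embeddings (assumed values): one predefined class and two
# semantically different groups of other-class words.
per_words = [[1.0, 0.0], [0.8, 0.2]]
other_group_a = [[0.0, 1.0], [0.2, 0.8]]
other_group_b = [[-1.0, 0.0], [-0.8, -0.2]]

# Baseline: every other-class word shares the single label "O".
single = build_prototypes(
    [("PER", v) for v in per_words]
    + [("O", v) for v in other_group_a + other_group_b]
)

# Variant: the other class is split into two hypothetical undefined classes.
split = build_prototypes(
    [("PER", v) for v in per_words]
    + [("O_a", v) for v in other_group_a]
    + [("O_b", v) for v in other_group_b]
)

query = [0.5, 0.7]  # an other-class word lying near group a
single_pred = classify(query, single)
split_pred = classify(query, split)
```

In this toy setup the single `O` prototype is the average of two unrelated groups and ends up far from both, so the query is assigned to `PER`; with separate prototypes it is assigned to `O_a`.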
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Domain Adaptation and Few-Shot Learning
