Zero-Shot Learning in Named-Entity Recognition with External Knowledge
Nguyen Van Hoang, Soeren Hougaard Mulvad, Dexter Neo Yuan Rong, and Yang Yue

TL;DR
This paper introduces ZERO, a zero-shot NER model that leverages external semantic knowledge to recognize unseen entities in new domains, addressing the generalization challenge of current systems.
Contribution
ZERO is a novel approach that combines contextualized embeddings with external knowledge for zero-shot and few-shot NER, improving domain generalization.
Findings
ZERO achieves an average macro F1 score of 0.23 on unseen domains.
ZERO outperforms LUKE in few-shot learning scenarios.
Performance correlates inversely with KL divergence between source and target domains.
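As a rough illustration of the last finding, the sketch below computes a KL divergence between two domains. It assumes each domain is summarized by a smoothed token-frequency distribution and measures KL(target || source). This summary does not say which distributions or which direction the authors use, so both choices, along with the helper names `token_distribution` and `kl_divergence` and the toy sentences, are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def token_distribution(tokens: List[str], vocab: List[str],
                       smoothing: float = 1e-6) -> Dict[str, float]:
    """Smoothed relative frequency of each vocabulary item (sums to 1)."""
    counts = Counter(tokens)
    total = len(tokens) + smoothing * len(vocab)
    return {w: (counts[w] + smoothing) / total for w in vocab}


def kl_divergence(p: Dict[str, float], q: Dict[str, float]) -> float:
    """KL(p || q) in nats; assumes p and q share keys and q is strictly positive."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p if p[w] > 0)


# Toy domains (illustrative data only, not from the paper).
source_tokens = "the senator met the minister in paris".split()
target_tokens = "the protein binds the receptor in cells".split()
vocab = sorted(set(source_tokens) | set(target_tokens))

p = token_distribution(target_tokens, vocab)  # target domain
q = token_distribution(source_tokens, vocab)  # source domain
print(f"KL(target || source) = {kl_divergence(p, q):.3f}")
```

Under the reported finding, a larger value here would go with lower zero-shot performance on the target domain.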
Abstract
A significant shortcoming of current state-of-the-art (SOTA) named-entity recognition (NER) systems is their lack of generalization to unseen domains, which poses a major problem since obtaining labeled data for NER in a new domain is expensive and time-consuming. We propose ZERO, a model that performs zero-shot and few-shot learning in NER to generalize to unseen domains by incorporating pre-existing knowledge in the form of semantic word embeddings. ZERO first obtains contextualized word representations of input sentences using the model LUKE, reduces their dimensionality, and compares them directly with the embeddings of the external knowledge, allowing ZERO to be trained to recognize unseen output entities. We find that ZERO performs well on unseen NER domains with an average macro F1 score of 0.23, outperforms LUKE in few-shot learning, and even achieves competitive scores on an…
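The pipeline described in the abstract (contextual representation, dimensionality reduction, direct comparison with label embeddings) can be sketched as follows. This is a minimal toy version and not the authors' implementation. It assumes the reduction is a linear projection and the comparison is a dot product, and the vectors below are hand-written stand-ins for LUKE outputs and for pre-trained word embeddings of the label names. All function names and values are hypothetical.

```python
from typing import Dict, List

Vector = List[float]


def project(matrix: List[Vector], vec: Vector) -> Vector:
    """Linear map that reduces a token vector to the label-embedding dimension."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def predict_label(token_vec: Vector, projection: List[Vector],
                  label_embeddings: Dict[str, Vector]) -> str:
    """Score the reduced token vector against every label embedding; return the best."""
    reduced = project(projection, token_vec)
    scores = {label: dot(reduced, emb) for label, emb in label_embeddings.items()}
    return max(scores, key=scores.get)


# Stand-in values (assumptions): a 4-d "contextual" vector reduced to 2-d.
projection = [[1.0, 0.0, 0.5, 0.0],
              [0.0, 1.0, 0.0, 0.5]]
label_embeddings = {"person": [1.0, 0.0], "location": [0.0, 1.0]}
token_vec = [0.9, 0.1, 0.4, 0.0]

print(predict_label(token_vec, projection, label_embeddings))  # prints: person
```

Because labels enter only through their embeddings, an entity type unseen in training can be scored by adding its embedding to `label_embeddings`, without changing the projection.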
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text and Document Classification Technologies
