DAMO-NLP at SemEval-2023 Task 2: A Unified Retrieval-augmented System for Multilingual Named Entity Recognition
Zeqi Tan, Shen Huang, Zixia Jia, Jiong Cai, Yinghui Li, Weiming Lu, Yueting Zhuang, Kewei Tu, Pengjun Xie, Fei Huang, Yong Jiang

TL;DR
This paper introduces U-RaNER, a unified retrieval-augmented system that leverages knowledge bases and improved retrieval strategies to enhance multilingual fine-grained NER performance, winning most tracks in the MultiCoNER 2 shared task.
Contribution
The paper presents a novel retrieval-augmented approach for multilingual NER that addresses knowledge insufficiency and limited context issues, outperforming previous systems.
Findings
U-RaNER wins 9 out of 13 tracks in MultiCoNER 2.
Incorporating Wikidata improves retrieval context.
ChatGPT underperforms compared to specialized NER systems.
Abstract
The MultiCoNER II shared task aims to tackle multilingual named entity recognition (NER) in fine-grained and noisy scenarios, and it inherits the semantic ambiguity and low-context setting of the MultiCoNER I task. To cope with these problems, the previous top systems in MultiCoNER I incorporate either knowledge bases or gazetteers. However, they still suffer from insufficient knowledge, limited context length, and a single retrieval strategy. In this paper, our team DAMO-NLP proposes a unified retrieval-augmented system (U-RaNER) for fine-grained multilingual NER. We perform error analysis on the previous top systems and reveal that their performance bottleneck lies in insufficient knowledge. Also, we discover that the limited context length causes the retrieved knowledge to be invisible to the model. To enhance the retrieval context, we incorporate the…
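The context-length issue mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name, the whitespace tokenization, the "[SEP]" separator, and the example sentences are all assumptions made for illustration. It appends retrieved passages to the input sentence until a token budget is reached, so any passage past the budget never reaches the model.

```python
def build_augmented_input(sentence, passages, max_tokens, sep="[SEP]"):
    """Append retrieved passages to the sentence until the token budget runs out.

    Returns the augmented token list and the number of passages that fit whole.
    Whitespace tokenization is a simplification (an assumption of this sketch);
    a real system would use the model's subword tokenizer.
    """
    tokens = sentence.split()
    kept = 0
    for passage in passages:
        candidate = [sep] + passage.split()
        if len(tokens) + len(candidate) > max_tokens:
            break  # this passage and all later ones stay invisible to the model
        tokens.extend(candidate)
        kept += 1
    return tokens, kept


# Made-up example data, not taken from the shared task.
sentence = "harry potter was released in 2001"
passages = [
    "Harry Potter and the Philosopher's Stone is a 2001 fantasy film",
    "Harry Potter is a series of novels by J. K. Rowling",
]

# With a small budget only the first passage fits; a larger budget admits both.
short_tokens, short_kept = build_augmented_input(sentence, passages, max_tokens=20)
long_tokens, long_kept = build_augmented_input(sentence, passages, max_tokens=64)
print(short_kept, long_kept)
```

Under these assumptions, raising the budget is what lets the second passage reach the model, which is the effect the abstract attributes to limited context length.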
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text and Document Classification Technologies
