DAMO-NLP at NLPCC-2022 Task 2: Knowledge Enhanced Robust NER for Speech Entity Linking

Shen Huang; Yuchen Zhai; Xinwei Long; Yong Jiang; Xiaobin Wang; Yin Zhang; Pengjun Xie

arXiv:2209.13187 · cs.CL · September 30, 2022

TL;DR

This paper introduces KENER, a knowledge-enhanced approach for robust speech entity linking that effectively handles noisy transcripts and speech variations, significantly improving recognition and disambiguation performance.

Contribution

The paper presents a novel knowledge-enhanced NER method that incorporates entity descriptions and retrieval strategies to improve robustness in speech entity linking.

Findings

1. Achieved 1st place in Track 1 of NLPCC-2022 Shared Task 2
2. Improved entity recognition accuracy in noisy speech transcripts
3. Enhanced disambiguation performance with knowledge integration

Abstract

Speech Entity Linking aims to recognize and disambiguate named entities in spoken languages. Conventional methods suffer gravely from the unfettered speech styles and the noisy transcripts generated by ASR systems. In this paper, we propose a novel approach called Knowledge Enhanced Named Entity Recognition (KENER), which focuses on improving robustness through painlessly incorporating proper knowledge in the entity recognition stage and thus improving the overall performance of entity linking. KENER first retrieves candidate entities for a sentence without mentions, and then utilizes the entity descriptions as extra information to help recognize mentions. The candidate entities retrieved by a dense retrieval module are especially useful when the input is short or noisy. Moreover, we investigate various data sampling strategies and design effective loss functions, in order to improve…
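The retrieve-then-recognize idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the knowledge base `KB`, the bag-of-words `embed` function (a toy stand-in for a learned dense retriever), the `[SEP]` separator, and all function names are assumptions made for illustration only. It shows how candidate entity descriptions might be retrieved for a sentence without knowing its mentions, and then appended to the sentence as extra context for a downstream NER model.

```python
import math
from collections import Counter

# Hypothetical toy knowledge base: entity title -> short description.
KB = {
    "Apple Inc.": "American technology company that makes the iPhone and Mac computers",
    "Apple (fruit)": "edible fruit produced by an apple tree",
    "Paris": "capital city of France",
}


def embed(text):
    """Toy stand-in for a dense encoder: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(sentence, kb, top_k=2):
    """Rank entities against the whole sentence; no mention spans are needed."""
    query = embed(sentence)
    scored = [
        (cosine(query, embed(title + " " + desc)), title)
        for title, desc in kb.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Keep only candidates that share at least some evidence with the sentence.
    return [title for score, title in scored[:top_k] if score > 0]


def build_ner_input(sentence, kb, top_k=2, sep=" [SEP] "):
    """Append retrieved entity descriptions to the sentence as extra context."""
    candidates = retrieve(sentence, kb, top_k)
    context = sep.join(f"{title}: {kb[title]}" for title in candidates)
    return sentence + sep + context if context else sentence


if __name__ == "__main__":
    print(build_ner_input("i bought a new iphone from the apple store", KB))
```

In a real system the augmented string would be fed to a sequence labeling model that tags only the original sentence tokens; the appended descriptions serve purely as context, which is why they can help most when the transcript itself is short or noisy.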

Code & Models

Repositories

modelscope/adaseq (PyTorch, official)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech and dialogue systems