AISHELL-NER: Named Entity Recognition from Chinese Speech

Boli Chen; Guangwei Xu; Xiaobin Wang; Pengjun Xie; Meishan Zhang; Fei Huang

arXiv:2202.08533·cs.CL·February 18, 2022

TL;DR

This paper introduces AISHELL-NER, a new Chinese speech dataset for Named Entity Recognition, and demonstrates that combining entity-aware ASR with pretrained NER models improves performance in Chinese spoken language understanding.

Contribution

The paper presents a new Chinese speech NER dataset and evaluates methods that enhance NER accuracy by integrating entity-aware ASR with pretrained NER models.

Findings

01 Combining entity-aware ASR with pretrained NER improves NER performance.

02 The AISHELL-NER dataset is publicly available for research.

03 State-of-the-art methods show significant gains on Chinese speech NER.

Abstract

Named Entity Recognition (NER) from speech is one of the Spoken Language Understanding (SLU) tasks, which aim to extract semantic information from the speech signal. NER from speech is usually performed with a two-step pipeline that consists of (1) processing the audio with an Automatic Speech Recognition (ASR) system and (2) applying an NER tagger to the ASR outputs. Recent work has shown the capability of the End-to-End (E2E) approach, which is essentially entity-aware ASR, for NER from English and French speech. However, because Chinese has many homophones and polyphones, NER from Chinese speech is effectively a more challenging task. In this paper, we introduce a new dataset, AISHELL-NER, for NER from Chinese speech. Extensive experiments are conducted to explore the performance of several state-of-the-art methods. The results demonstrate that the performance could be improved by…
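
To make the idea of entity-aware ASR concrete, here is a minimal sketch of how a transcript that carries inline entity markers could be turned back into plain text plus entity spans. The bracket scheme (`[ ]` for PER, `( )` for LOC, `< >` for ORG), the function name `parse_entity_aware_transcript`, and the example sentence are all assumptions made for illustration; the markers and tooling actually used in AISHELL-NER may differ.

```python
# Hypothetical marker scheme: opening symbol -> (closing symbol, entity label).
# This is an illustrative assumption, not the dataset's documented format.
MARKERS = {"[": ("]", "PER"), "(": (")", "LOC"), "<": (">", "ORG")}


def parse_entity_aware_transcript(annotated):
    """Split an annotated transcript into plain text and flat entity spans.

    Returns (plain_text, entities), where each entity is a tuple
    (label, start, end, surface) with character offsets into plain_text.
    Nested entities are not supported; an unclosed entity is dropped.
    """
    plain = []        # characters of the transcript without markers
    entities = []     # collected (label, start, end, surface) tuples
    open_ent = None   # (closing symbol, label, start offset) while inside an entity
    for ch in annotated:
        if open_ent is None and ch in MARKERS:
            close, label = MARKERS[ch]
            open_ent = (close, label, len(plain))
        elif open_ent is not None and ch == open_ent[0]:
            _, label, start = open_ent
            entities.append((label, start, len(plain), "".join(plain[start:])))
            open_ent = None
        else:
            plain.append(ch)
    return "".join(plain), entities


# Made-up example sentence: "Jack Ma founded Alibaba in Hangzhou".
text, spans = parse_entity_aware_transcript("[马云]在(杭州)创办了<阿里巴巴>")
for label, start, end, surface in spans:
    print(label, start, end, surface)
```

In the two-step pipeline, the same spans would instead come from running an NER tagger over the plain ASR output, so a parser like this gives both approaches a common output format for scoring.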

Code & Models

Repositories

alibaba-nlp/aishell-ner (official)

Taxonomy

Topics: Natural Language Processing Techniques · Speech Recognition and Synthesis · Topic Modeling