Incorporating Class-based Language Model for Named Entity Recognition in Factorized Neural Transducer

Peng Wang; Yifan Yang; Zheng Liang; Tian Tan; Shiliang Zhang; Xie Chen

arXiv:2309.07648·eess.AS·June 11, 2024

TL;DR

This paper introduces C-FNT, an end-to-end model that integrates class-based language models into factorized neural transducers to improve named entity recognition without compromising overall speech recognition accuracy.

Contribution

The paper proposes a novel approach that incorporates class-based language models into the factorized neural transducer (FNT) to improve named entity recognition in speech recognition.

Findings

01. Significant reduction in named entity errors.

02. Maintains overall word recognition performance.

03. Effective decoupling of acoustic and linguistic information.

Abstract

Despite advances in end-to-end (E2E) speech recognition models, named entity recognition (NER) remains challenging yet critical for semantic understanding. Previous studies mainly focus on rule-based or attention-based contextual biasing algorithms. However, their performance can be sensitive to the biasing weight or degraded by excessive attention to the named entity list, and they carry a risk of false triggering. Inspired by the success of the class-based language model (LM) for NER in conventional hybrid systems and by the effective decoupling of acoustic and linguistic information in the factorized neural Transducer (FNT), we propose C-FNT, a novel E2E model that incorporates class-based LMs into FNT. In C-FNT, the LM score of a named entity can be associated with its name class instead of its surface form. The experimental results show that our proposed C-FNT…
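The idea of tying an entity's LM score to its class rather than to its surface form can be illustrated with the standard class-based LM factorization, P(word | history) = P(class | history) · P(word | class). The sketch below is a toy illustration under assumed values, not the C-FNT implementation: the `$NAME` class token, the probability tables, and the function name `class_lm_log_prob` are all hypothetical.

```python
import math

# Hypothetical toy probabilities, chosen only for illustration.
# Context LM: P(token | previous token), where "$NAME" is a class token.
CONTEXT_LM = {
    ("call", "$NAME"): 0.30,
    ("call", "me"): 0.20,
}

# In-class distribution: P(surface form | class).
CLASS_MEMBERS = {
    "$NAME": {"alice": 0.5, "bob": 0.5},
}


def class_lm_log_prob(prev, word, floor=1e-6):
    """Return log P(word | prev), routing class members through their class.

    Unseen (prev, token) pairs fall back to a small floor probability.
    """
    for cls, members in CLASS_MEMBERS.items():
        if word in members:
            # Score the class in context, then the word inside the class.
            p_class = CONTEXT_LM.get((prev, cls), floor)
            return math.log(p_class) + math.log(members[word])
    # Ordinary words are scored directly by the context LM.
    return math.log(CONTEXT_LM.get((prev, word), floor))


if __name__ == "__main__":
    for w in ("alice", "bob", "me"):
        print(w, class_lm_log_prob("call", w))
```

In this sketch every member of `$NAME` shares the same context score, so adding a new name only requires extending the in-class table; the context model that predicts the class token is left unchanged.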


Taxonomy

Topics: Topic Modeling · Speech Recognition and Synthesis · Natural Language Processing Techniques

Methods: Focus