On the Use of External Data for Spoken Named Entity Recognition

Ankita Pasad; Felix Wu; Suwon Shon; Karen Livescu; Kyu J. Han

arXiv:2112.07648 · cs.CL · July 12, 2022 · 1 citation

TL;DR

This paper explores how external speech and text data can improve low-resource spoken named entity recognition, showing that such approaches can outperform the use of pre-trained models alone.

Contribution

It introduces methods for utilizing external data beyond self-supervised pre-training to enhance spoken NER performance in resource-limited settings.

Findings

01. External-data approaches improve F1 scores by up to 16%.

02. When external data is used, end-to-end models outperform pipeline models.

03. End-to-end models focus more on NER-specific words.

Abstract

Spoken language understanding (SLU) tasks involve mapping from speech audio signals to semantic labels. Given the complexity of such tasks, good performance might be expected to require large labeled datasets, which are difficult to collect for each new task and domain. However, recent advances in self-supervised speech representations have made it feasible to consider learning SLU models with limited labeled data. In this work we focus on low-resource spoken named entity recognition (NER) and address the question: Beyond self-supervised pre-training, how can we use external speech and/or text data that are not annotated for the task? We draw on a variety of approaches, including self-training, knowledge distillation, and transfer learning, and consider their applicability to both end-to-end models and pipeline (speech recognition followed by text NER model) approaches. We find that…
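
Of the approaches the abstract names, self-training is the simplest to illustrate. The sketch below is not the authors' implementation: it assumes a toy per-token lookup tagger in place of a real speech or text NER model, and the names `train_lookup_tagger`, `self_train`, and the confidence `threshold` are hypothetical choices made for illustration. It shows only the general pattern: a teacher trained on the small labeled set tags unannotated data, and a student is then trained on the labeled data plus the confidently pseudo-labeled examples.

```python
from collections import Counter, defaultdict
from typing import Callable, Dict, List, Tuple

Example = Tuple[str, str]                    # (token, tag)
Model = Callable[[str], Tuple[str, float]]   # token -> (tag, confidence)


def train_lookup_tagger(examples: List[Example]) -> Model:
    """Toy stand-in for an NER model: per-token tag frequencies (assumption)."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for token, tag in examples:
        counts[token][tag] += 1

    def predict(token: str) -> Tuple[str, float]:
        if token not in counts:
            return "O", 0.0                  # unseen token: no confidence
        tag, n = counts[token].most_common(1)[0]
        return tag, n / sum(counts[token].values())

    return predict


def self_train(
    labeled: List[Example],
    unlabeled: List[str],
    threshold: float = 0.9,                  # hypothetical filtering threshold
) -> Tuple[Model, List[Example]]:
    """One round of self-training with confidence-filtered pseudo-labels."""
    teacher = train_lookup_tagger(labeled)   # 1. teacher from labeled data
    pseudo: List[Example] = []
    for token in unlabeled:                  # 2. tag the external, unannotated data
        tag, conf = teacher(token)
        if conf >= threshold:                # 3. keep only confident predictions
            pseudo.append((token, tag))
    student = train_lookup_tagger(labeled + pseudo)  # 4. student on both sets
    return student, pseudo


if __name__ == "__main__":
    labeled = [("paris", "PLACE"), ("paris", "PLACE"),
               ("apple", "ORG"), ("apple", "O")]
    student, pseudo = self_train(labeled, ["paris", "apple", "berlin"])
    print(pseudo)
```

In a realistic setting the teacher and student would be neural models and the unlabeled inputs would be audio or transcripts; whether to filter pseudo-labels by confidence at all is a design choice, not something the abstract specifies.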

Code & Models

Repositories

asappresearch/spoken-ner (PyTorch, official)

Taxonomy

Topics: Speech Recognition and Synthesis · Topic Modeling · Natural Language Processing Techniques