Short-Term Word-Learning in a Dynamically Changing Environment
Christian Huber, Rishu Kumar, Ondřej Bojar, Alexander Waibel

TL;DR
This paper explores dynamic methods for enhancing speech recognition systems to better identify new words like names and technical terms, balancing improved detection with false alarm risks.
Contribution
It introduces techniques for dynamically acquiring important words for recognition memory and analyzes the trade-offs involved.
Findings
Significant improvement in new word detection (F1 score 0.30 to 0.80)
Minor increase in false alarms when adding new words
Effective extraction of keywords from supporting documents
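As a rough illustration of what extracting keywords from a supporting document could look like, the sketch below collects words that are missing from a known vocabulary and ranks them by frequency. This is not the extraction method used in the paper; the function name `extract_candidate_words`, the `known_vocab` set, and the `min_count` threshold are all hypothetical and chosen only for this example.

```python
import re
from collections import Counter


def extract_candidate_words(document, known_vocab, min_count=1):
    """Return words from `document` that are absent from `known_vocab`.

    Hypothetical helper for illustration only: a word counts as a
    candidate new word if its lowercase form is not in the vocabulary.
    """
    # Match runs of letters (any script), allowing inner hyphens/apostrophes.
    tokens = re.findall(r"[^\W\d_]+(?:[-'][^\W\d_]+)*", document)
    # Count only out-of-vocabulary tokens, keeping their original casing.
    counts = Counter(t for t in tokens if t.lower() not in known_vocab)
    # Most frequent candidates first; drop those seen fewer than min_count times.
    return [word for word, n in counts.most_common() if n >= min_count]


# Toy input, invented for this example.
slides = "Dr. Bojar presented the Waibel memory. Bojar thanked the team."
vocab = {"dr", "presented", "the", "memory", "thanked", "team"}
candidates = extract_candidate_words(slides, vocab)
```

The resulting candidates could then be added to the word/phrase memory. A real system would likely need a stronger filter than a plain vocabulary lookup, since every added word carries some risk of false alarms.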
Abstract
Neural sequence-to-sequence automatic speech recognition (ASR) systems are in principle open-vocabulary systems when using appropriate modeling units. In practice, however, they often fail to recognize words not seen during training, e.g., named entities, numbers or technical terms. To alleviate this problem, Huber et al. proposed to supplement an end-to-end ASR system with a word/phrase memory and a mechanism to access this memory to recognize the words and phrases correctly. In this paper we study a) methods to acquire important words for this memory dynamically and b) the trade-off between improvement in recognition accuracy of new words and the potential danger of false alarms for those added words. We demonstrate significant improvements in the detection rate of new words with only a minor increase in false alarms (F1 score 0.30 → 0.80), when using an appropriate…
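The trade-off between detecting new words and raising false alarms is what the F1 score summarizes: it is the harmonic mean of precision (how many detections were correct) and recall (how many new words were found). The sketch below computes it from raw counts. The function name `detection_scores` and the counts are hypothetical; they are not the paper's data.

```python
def detection_scores(true_positives, false_alarms, misses):
    """Compute (precision, recall, F1) for new-word detection from counts.

    true_positives: new words correctly recognized
    false_alarms:   memory words output where they were not spoken
    misses:         new words that were spoken but not recognized
    """
    detected = true_positives + false_alarms
    actual = true_positives + misses
    precision = true_positives / detected if detected else 0.0
    recall = true_positives / actual if actual else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Invented counts for illustration, not results from the paper.
precision, recall, f1 = detection_scores(true_positives=8, false_alarms=2, misses=2)
```

Because F1 penalizes both misses and false alarms, adding words to the memory improves the score only if the gain in recall outweighs any loss in precision.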
Taxonomy
Topics: Speech Recognition and Synthesis · Neural Networks and Applications · Time Series Analysis and Forecasting
