TL;DR
This paper introduces a recurrent neural network language model (RNNLM) approximation method for subword units. It improves out-of-vocabulary (OOV) keyword search by mitigating data sparsity and capturing long-span dependencies.
Contribution
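The paper proposes interpolating conventional n-gram models with an RNNLM approximation. It also develops a new approximation technique for subword units that produces variable-order n-grams, including n-grams not observed in the training corpus, to improve OOV recognition in spoken keyword search.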
Findings
Interpolating the RNNLM approximation with conventional n-gram models improves OOV recognition.
The new approximation method outperforms baseline models on Arabic and Finnish keyword search tasks.
The enhanced models achieve a higher maximum term-weighted value (MTWV) for subword units.
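For readers unfamiliar with the metric, the term-weighted value as commonly defined in NIST keyword search evaluations is sketched below. This summary does not state which variant or which constant the paper uses, so the definition and the typical choice of \beta = 999.9 are assumptions here, not details taken from the paper.

```latex
% Term-weighted value at detection threshold \theta (common NIST-style definition, assumed here):
%   P_miss(\theta): miss probability averaged over query terms
%   P_FA(\theta):   false-alarm probability averaged over query terms
%   \beta:          cost weight on false alarms (often 999.9 in NIST evaluations)
\mathrm{TWV}(\theta) = 1 - \left[ P_{\mathrm{miss}}(\theta) + \beta \, P_{\mathrm{FA}}(\theta) \right]

% The maximum term-weighted value is the best TWV over all thresholds:
\mathrm{MTWV} = \max_{\theta} \, \mathrm{TWV}(\theta)
```

Under this definition a perfect system scores 1, and a system that returns nothing scores 0.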
Abstract
In spoken keyword search, the query may contain out-of-vocabulary (OOV) words not observed when training the speech recognition system. Using subword language models (LMs) in the first-pass recognition makes it possible to recognize the OOV words, but even the subword n-gram LMs suffer from data sparsity. Recurrent neural network (RNN) LMs alleviate the sparsity problems but are not suitable for first-pass recognition as such. One way to solve this is to approximate the RNNLMs by back-off n-gram models. In this paper, we propose to interpolate the conventional n-gram models and the RNNLM approximation for better OOV recognition. Furthermore, we develop a new RNNLM approximation method suitable for subword units: it produces variable-order n-grams to include long-span approximations and also considers n-grams that were not originally observed in the training corpus. To evaluate these…
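The interpolation idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the interpolation scheme, so plain linear interpolation is assumed here, and the function name `interpolate`, the weight `lam`, and the toy subword probabilities are all hypothetical, not taken from the paper.

```python
def interpolate(p_ngram, p_rnn_approx, lam=0.5):
    """Linearly interpolate two next-unit distributions for one context.

    p_ngram, p_rnn_approx: dicts mapping a subword unit to its probability.
    lam: weight on the conventional n-gram model (hypothetical value).
    Units missing from one model are treated as having probability 0 there.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    units = set(p_ngram) | set(p_rnn_approx)
    return {
        u: lam * p_ngram.get(u, 0.0) + (1.0 - lam) * p_rnn_approx.get(u, 0.0)
        for u in units
    }


# Toy distributions over the next subword unit in some fixed context.
# The n-gram model never saw "ssa" in this context; the RNNLM approximation
# assigns it some probability mass (made-up numbers for illustration).
p_ngram = {"ta": 0.7, "lo": 0.3}
p_rnn_approx = {"ta": 0.5, "lo": 0.2, "ssa": 0.3}

mixed = interpolate(p_ngram, p_rnn_approx, lam=0.5)
```

In this toy case the mixture keeps a valid distribution while giving nonzero probability to a continuation the n-gram model alone would not cover. In practice the weight would be tuned on held-out data.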
