TL;DR
This paper surveys adaptation algorithms for neural network-based speech recognition, grouping them into embedding-based methods, model parameter adaptation, and data augmentation, and summarizes their reported relative error rate reductions in a meta-analysis.
Contribution
It offers a structured classification and meta-analysis of adaptation techniques for neural speech recognition systems, highlighting their relative effectiveness.
Findings
Embedding-based methods and model parameter adaptation are effective for speaker and domain adaptation.
The meta-analysis, based on relative error rate reductions reported in the literature, shows that certain adaptation methods substantially reduce error rates.
The overview identifies gaps and future directions in speech recognition adaptation research.
Abstract
We present a structured overview of adaptation algorithms for neural network-based speech recognition, considering both hybrid hidden Markov model / neural network systems and end-to-end neural network systems, with a focus on speaker adaptation, domain adaptation, and accent adaptation. The overview characterizes adaptation algorithms as based on embeddings, model parameter adaptation, or data augmentation. We present a meta-analysis of the performance of speech recognition adaptation algorithms, based on relative error rate reductions as reported in the literature.
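The meta-analysis metric mentioned in the abstract, relative error rate reduction, is conventionally computed as the drop in error rate divided by the baseline error rate. The snippet below is a minimal sketch of that arithmetic, not code from the paper; the function name `relative_error_rate_reduction` and the example word error rates are hypothetical and chosen only for illustration.

```python
def relative_error_rate_reduction(baseline_wer: float, adapted_wer: float) -> float:
    """Return the relative error rate reduction, in percent of the baseline.

    Both arguments are error rates in the same unit (for example, word
    error rate in percent). A negative result means adaptation made the
    system worse than the baseline.
    """
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - adapted_wer) / baseline_wer


# Hypothetical numbers, not taken from the paper:
# a baseline WER of 20.0% that drops to 18.0% after adaptation.
reduction = relative_error_rate_reduction(20.0, 18.0)
print(f"Relative error rate reduction: {reduction:.1f}%")
```

Reporting the relative rather than the absolute reduction makes results from systems with very different baseline error rates easier to compare, which is presumably why a literature-wide meta-analysis would rely on it.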
