Domain Expansion in DNN-based Acoustic Models for Robust Speech Recognition
Shahram Ghorbani, Soheil Khorram, John H.L. Hansen

TL;DR
This paper explores domain expansion techniques for DNN acoustic models in speech recognition, focusing on adapting to new accents with minimal forgetting, by proposing and evaluating several constraint-based adaptation methods.
Contribution
The study introduces and compares four domain expansion techniques, including a hybrid method, to improve accent adaptation in DNN acoustic models without retraining on all data.
Findings
SKLD (soft KL-divergence) outperforms EWC (elastic weight consolidation) in accent adaptation
EWC surpasses WCA (weight constraint adaptation) in effectiveness
Hybrid SKLD-EWC yields the best overall results
Abstract
Training acoustic models on sequentially incoming data, while both leveraging the new data and avoiding the forgetting effect, is a key obstacle to achieving human-level intelligence in speech recognition. An obvious way to leverage data from a new domain (e.g., newly accented speech) is to first build a comprehensive dataset of all domains by combining all available data, and then retrain the acoustic models on that dataset. However, as the amount of training data grows, storing and retraining on such a large-scale dataset becomes practically impossible. To deal with this problem, we study several domain expansion techniques that exploit only the data of the new domain to build a stronger model for all domains. These techniques aim to learn the new domain with a minimal forgetting effect (i.e., they maintain the original model's performance). These…
Taxonomy
Methods: Elastic Weight Consolidation
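As a rough illustration of the Elastic Weight Consolidation idea named above, the sketch below computes the standard quadratic EWC penalty, (lam / 2) * sum_i F_i * (theta_i - theta_orig_i)^2, which would be added to the loss on the new-domain data. This is the generic textbook form, not necessarily the exact formulation used in the paper. The function names, the diagonal Fisher values, and the weight `lam` are all hypothetical and chosen only for this example.

```python
def ewc_penalty(theta, theta_orig, fisher_diag, lam=1.0):
    """Generic EWC penalty: (lam / 2) * sum_i F_i * (theta_i - theta_orig_i)^2.

    theta       -- current (adapted) parameters, as a flat list of floats
    theta_orig  -- parameters of the original model (assumed frozen)
    fisher_diag -- diagonal Fisher information estimates, one per parameter
    lam         -- hypothetical trade-off weight between new-domain loss and penalty
    """
    return 0.5 * lam * sum(
        f * (t - t0) ** 2 for t, t0, f in zip(theta, theta_orig, fisher_diag)
    )


def ewc_penalty_grad(theta, theta_orig, fisher_diag, lam=1.0):
    """Gradient of the penalty with respect to theta: lam * F_i * (theta_i - theta_orig_i)."""
    return [lam * f * (t - t0) for t, t0, f in zip(theta, theta_orig, fisher_diag)]


# Toy values (made up): the first parameter is "important" (large Fisher value),
# so moving it away from the original model is penalized far more than the second.
theta_orig = [1.0, -2.0, 0.5]
fisher_diag = [10.0, 0.1, 1.0]
theta = [1.5, -1.0, 0.5]

penalty = ewc_penalty(theta, theta_orig, fisher_diag, lam=2.0)
grad = ewc_penalty_grad(theta, theta_orig, fisher_diag, lam=2.0)
```

In this toy case the first parameter moves only half as far as the second, yet it contributes most of the penalty because its Fisher value is a hundred times larger. This is the sense in which EWC slows learning on parameters that matter to the original domains.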
