Continual learning using lattice-free MMI for speech recognition
Hossein Hadian, Arseniy Gorin

TL;DR
This paper explores regularization-based continual learning methods for neural network acoustic models in speech recognition, demonstrating improved domain adaptation and reduced forgetting using sequence-level LWF with LF-MMI training.
Contribution
It introduces a sequence-level LWF regularization technique that leverages LF-MMI posteriors to enhance continual learning in speech recognition models.
Findings
Sequence-level LWF reduces average word error rate (WER) by up to 9.4% relative.
Regular LWF and EWC help mitigate catastrophic forgetting.
The approach effectively adapts models to multiple speech domains.
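The two baseline regularizers named in these findings, EWC and regular LWF, can be illustrated with a small sketch. This is not the paper's implementation: it assumes frame-level softmax outputs and a diagonal Fisher estimate, and it does not cover the sequence-level variant built on LF-MMI posteriors. The function names (`log_softmax`, `ewc_penalty`, `lwf_penalty`) are hypothetical, chosen only for this example.

```python
import math

def log_softmax(logits):
    # Numerically stable log-softmax for one frame of logits.
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]

def ewc_penalty(params, old_params, fisher_diag):
    # EWC (assumed diagonal form): quadratic penalty on parameter drift,
    # weighted per parameter by a Fisher estimate from the old domain.
    return 0.5 * sum(
        f * (p - q) ** 2 for p, q, f in zip(params, old_params, fisher_diag)
    )

def lwf_penalty(new_logits, old_logits):
    # Frame-level LWF (assumed form): cross-entropy between the frozen
    # old model's posteriors and the adapted model's posteriors,
    # averaged over frames. Each argument is a list of per-frame logits.
    total = 0.0
    for new_frame, old_frame in zip(new_logits, old_logits):
        old_post = [math.exp(v) for v in log_softmax(old_frame)]
        new_log_post = log_softmax(new_frame)
        total -= sum(o * n for o, n in zip(old_post, new_log_post))
    return total / len(new_logits)
```

In either case the penalty would be added to the new-domain training loss with a tunable weight, so that adaptation is traded off against drift from the old model.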
Abstract
Continual learning (CL), or domain expansion, recently became a popular topic for automatic speech recognition (ASR) acoustic modeling because practical systems have to be updated frequently in order to work robustly on types of speech not observed during initial training. While sequential adaptation allows tuning a system to a new domain, it may result in performance degradation on the old domains due to catastrophic forgetting. In this work we explore regularization-based CL for neural network acoustic models trained with the lattice-free maximum mutual information (LF-MMI) criterion. We simulate domain expansion by incrementally adapting the acoustic model on different public datasets that include several accents and speaking styles. We investigate two well-known CL techniques, elastic weight consolidation (EWC) and learning without forgetting (LWF), which aim to reduce forgetting by…
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing
