Deep Transfer Learning for Automatic Speech Recognition: Towards Better Generalization

Hamza Kheddar; Yassine Himeur; Somaya Al-Maadeed; Abbes Amira; Faycal Bensaali

arXiv:2304.14535·cs.SD·August 1, 2023·6 cites

Open Access

TL;DR

This paper surveys deep transfer learning approaches in automatic speech recognition, highlighting recent developments, analyzing frameworks, and identifying challenges and future opportunities for improving model generalization with limited data.

Contribution

It provides a comprehensive taxonomy and critical analysis of DTL-based ASR frameworks, offering insights into current challenges and future research directions.

Findings

01 Highlights the effectiveness of DTL in low-resource ASR scenarios

02 Identifies key limitations of current DTL frameworks

03 Suggests future research opportunities for better generalization

Abstract

Automatic speech recognition (ASR) has recently become an important challenge when using deep learning (DL). It requires large-scale training datasets and high computational and storage resources. Moreover, DL techniques, and machine learning (ML) approaches in general, assume that training and testing data come from the same domain, with the same input feature space and data distribution characteristics. This assumption, however, does not hold in some real-world artificial intelligence (AI) applications. There are also situations where real data is challenging or expensive to gather, or occurs only rarely, so the data requirements of DL models cannot be met. Deep transfer learning (DTL) has been introduced to overcome these issues; it helps develop high-performing models using real datasets that are small or slightly different but related to the training data. This paper…

Taxonomy

Topics: Speech Recognition and Synthesis · Music and Audio Processing · Speech and Audio Processing