To Distill or Not to Distill? On the Robustness of Robust Knowledge Distillation
Abdul Waheed, Karima Kadaoui, Muhammad Abdul-Mageed

TL;DR
This paper explores the effectiveness of knowledge distillation in improving Arabic speech recognition models, introducing a new dialectal dataset and demonstrating that distilled models can outperform larger, state-of-the-art models on dialectal and standard benchmarks.
Contribution
The paper presents a novel human-annotated evaluation dataset covering five under-represented Arabic dialects, and evaluates a distilled model that surpasses larger models in accuracy while being more efficient.
Findings
The best distilled model achieves 45.0% overall WER, outperforming both SeamlessM4T-large-v2, a SoTA model twice its size, and its own teacher, Whisper-large-v2.
New dialectal dataset improves evaluation of ASR models on under-represented dialects.
Error analysis reveals common mistakes and challenges in dialectal speech recognition.
Abstract
Arabic is known to present unique challenges for Automatic Speech Recognition (ASR). On one hand, its rich linguistic diversity and wide range of dialects complicate the development of robust, inclusive models. On the other, current multilingual ASR models are compute-intensive and lack proper comprehensive evaluations. In light of these challenges, we distill knowledge from large teacher models into smaller student variants that are more efficient. We also introduce a novel human-annotated dataset covering five under-represented Arabic dialects for evaluation. We further evaluate both our models and existing SoTA multilingual models on both standard available benchmarks and our new dialectal data. Our best distilled model's overall performance (45.0% WER) surpasses that of a SoTA model twice its size (SeamlessM4T-large-v2, WER=%) and its teacher model (Whisper-large-v2,…
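Since every comparison here is reported in word error rate (WER), a minimal sketch of the metric may help when reading the numbers. It uses the standard definition (word-level edit distance divided by the number of reference words). The function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration; the paper's own text normalization (for example, how Arabic diacritics or punctuation are handled) is not described in this summary and is not reproduced here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are split on whitespace with no further normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion (reference word missing)
                curr[j - 1] + 1,                  # insertion (extra hypothesis word)
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("sat" -> "sit") and one deletion ("the"): 2 errors / 6 words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference, which is worth keeping in mind when comparing models on noisy dialectal speech.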
Taxonomy
Topics: Bayesian Modeling and Causal Inference · AI-based Problem Solving and Planning · Machine Learning and Algorithms
