Inter-KD: Intermediate Knowledge Distillation for CTC-Based Automatic Speech Recognition
Ji Won Yoon, Beom Jun Woo, Sunghwan Ahn, Hyeonseung Lee, and Nam Soo Kim

TL;DR
This paper introduces Inter-KD, a novel knowledge distillation method for CTC-based speech recognition that transfers knowledge to intermediate layers, significantly improving performance without additional language models or data augmentation.
Contribution
The paper proposes Inter-KD, an intermediate-layer knowledge distillation technique for CTC models that improves the recognition accuracy of the student network.
Findings
Inter-KD outperforms conventional KD methods on LibriSpeech.
Inter-KD reduces WER from 8.85% to 6.30% on test-clean.
No language model or data augmentation needed.
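To put the reported WER figures in perspective, the short calculation below derives the absolute and relative reduction from the two numbers quoted above. The helper name `relative_wer_reduction` is only illustrative and does not come from the paper.

```python
def relative_wer_reduction(wer_before: float, wer_after: float) -> float:
    """Fraction of the baseline WER that was removed."""
    return (wer_before - wer_after) / wer_before


# WER values (in percent) reported for LibriSpeech test-clean.
wer_before, wer_after = 8.85, 6.30

absolute = wer_before - wer_after
relative = relative_wer_reduction(wer_before, wer_after)

print(f"absolute: {absolute:.2f} points, relative: {relative:.1%}")
# prints "absolute: 2.55 points, relative: 28.8%"
```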
Abstract
Recent advances in deep learning have brought considerable improvement to end-to-end speech recognition, simplifying the traditional pipeline while producing promising results. Among end-to-end models, the connectionist temporal classification (CTC)-based model has attracted research interest due to its non-autoregressive nature. However, such CTC models require a heavy computational cost to achieve outstanding performance. To mitigate the computational burden, we propose a simple yet effective knowledge distillation (KD) method for the CTC framework, namely Inter-KD, that additionally transfers the teacher's knowledge to the intermediate CTC layers of the student network. From the experimental results on LibriSpeech, we verify that Inter-KD outperforms the conventional KD methods. Without using any language model (LM) and data…
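The idea in the abstract, distilling the teacher's output into an intermediate CTC layer of the student as well as its final layer, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's actual objective: it uses a frame-level KL divergence between softmax posteriors as a stand-in distillation term, it omits the supervised CTC loss on the transcripts, and the function names and the `inter_weight` parameter are hypothetical.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one frame's token logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def frame_kl(teacher_logits, student_logits):
    """Mean over frames of KL(teacher || student) between token posteriors."""
    total = 0.0
    for t_frame, s_frame in zip(teacher_logits, student_logits):
        t_logp = log_softmax(t_frame)
        s_logp = log_softmax(s_frame)
        total += sum(math.exp(tl) * (tl - sl) for tl, sl in zip(t_logp, s_logp))
    return total / len(teacher_logits)


def inter_kd_loss(teacher_logits, student_final_logits, student_inter_logits,
                  inter_weight=0.5):
    """Assumed combination: weighted sum of final-layer and intermediate-layer KD.

    `inter_weight` is a hypothetical hyperparameter, not taken from the paper.
    """
    kd_final = frame_kl(teacher_logits, student_final_logits)
    kd_inter = frame_kl(teacher_logits, student_inter_logits)
    return (1.0 - inter_weight) * kd_final + inter_weight * kd_inter


# Toy example: 2 frames, 3 output tokens (blank, "a", "b").
teacher = [[2.0, 0.5, -1.0], [0.1, 3.0, -0.5]]
student_final = [[1.5, 0.7, -0.8], [0.0, 2.0, 0.2]]
student_inter = [[0.5, 0.5, 0.0], [0.3, 0.9, 0.4]]

loss = inter_kd_loss(teacher, student_final, student_inter)
```

In a real training setup the intermediate logits would come from a CTC projection attached to a middle encoder layer of the student, and this distillation term would be added to the usual CTC losses; those details are not stated in the text above and are left out here.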
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing
Methods: Knowledge Distillation
