Developing Far-Field Speaker System Via Teacher-Student Learning
Jinyu Li, Rui Zhao, Zhuo Chen, Changliang Liu, Xiong Xiao, Guoli Ye, and Yifan Gong

TL;DR
This paper presents a teacher-student learning approach to adapt and compress models for far-field speaker systems, improving accuracy and reducing model size without transcription.
Contribution
It introduces a novel use of T/S learning for both adapting acoustic models to far-field data and compressing keyword spotting models, enhancing performance and efficiency.
Findings
Adapted acoustic model: 72.60% relative WER reduction on play-back test data and 57.16% on live test data, compared with the baseline.
KWS model size decreased by 27 times with maintained accuracy.
Models trained with untranscribed data show significant performance gains.
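The WER figures reported for this paper are relative reductions, not absolute percentage-point drops. A minimal sketch of that arithmetic follows; the function name and the example WER values are illustrative assumptions and are not taken from the paper.

```python
def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Return the relative WER reduction in percent.

    Both arguments are word error rates in the same unit (e.g. percent).
    """
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


# Hypothetical numbers, chosen only to illustrate the formula:
# a drop from 10.0% WER to 2.74% WER is a 72.6% relative reduction.
example = relative_wer_reduction(10.0, 2.74)
```

A relative reduction of this kind says nothing about the absolute WER of either system, so it should be read together with the baseline numbers in the paper.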
Abstract
In this study, we develop the keyword spotting (KWS) and acoustic model (AM) components in a far-field speaker system. Specifically, we use teacher-student (T/S) learning to adapt a close-talk well-trained production AM to far-field by using parallel close-talk and simulated far-field data. We also use T/S learning to compress a large-size KWS model into a small-size one to fit the device computational cost. Without the need of transcription, T/S learning well utilizes untranscribed data to boost the model performance in both the AM adaptation and KWS model compression. We further optimize the models with sequence discriminative training and live data to reach the best performance of systems. The adapted AM improved from the baseline by 72.60% and 57.16% relative word error rate reduction on play-back and live test data, respectively. The final KWS model size was reduced by 27 times…
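To make the adaptation idea concrete, here is a minimal sketch of a T/S objective under assumptions: the abstract does not spell out the loss, so this uses a common formulation in which the teacher's posteriors on a close-talk frame serve as soft targets for the student's posteriors on the parallel far-field frame. The function names and the list-of-logits input layout are hypothetical.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one frame of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def ts_loss(teacher_logits, student_logits):
    """Frame-averaged cross entropy between teacher and student posteriors.

    teacher_logits: per-frame teacher outputs on close-talk audio.
    student_logits: per-frame student outputs on the parallel far-field audio.
    No transcription is used; the teacher posteriors are the only targets.
    Minimizing this is equivalent to minimizing KL(teacher || student),
    because the teacher entropy does not depend on the student.
    """
    total = 0.0
    for t_frame, s_frame in zip(teacher_logits, student_logits):
        t_post = [math.exp(v) for v in log_softmax(t_frame)]
        s_logp = log_softmax(s_frame)
        total -= sum(p * lp for p, lp in zip(t_post, s_logp))
    return total / len(teacher_logits)
```

The loss reaches its minimum, the teacher's own entropy, when the student reproduces the teacher's posteriors exactly. The same soft-target objective can in principle be used for compression by pairing a large teacher with a small student on identical input, as described for the KWS model.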
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing
Methods: Attention Model
