Breaking the trade-off in personalized speech enhancement with cross-task knowledge distillation
Hassan Taherian, Sefik Emre Eskimez, and Takuya Yoshioka

TL;DR
This paper introduces a training framework for personalized speech enhancement (PSE) that uses cross-task knowledge distillation with a personalized voice activity detector (pVAD) to mitigate the trade-off between over-suppression of target speech and leakage of interfering speech.
Contribution
The paper proposes a new PSE training method that uses cross-task knowledge distillation and a pVAD to mitigate the trade-off between target-speech over-suppression and interference leakage.
Findings
Reduces interference leakage in segments where the target speaker is silent
Mitigates the trade-off between target-speech over-suppression and interference leakage
Improves PSE performance across various scenarios
Abstract
Personalized speech enhancement (PSE) models achieve promising results compared with unconditional speech enhancement models due to their ability to remove interfering speech in addition to background noise. Unlike unconditional speech enhancement, causal PSE models may occasionally remove the target speech by mistake. The PSE models also tend to leak interfering speech when the target speaker is silent for an extended period. We show that existing PSE methods suffer from a trade-off between speech over-suppression and interference leakage by addressing one problem at the expense of the other. We propose a new PSE model training framework using cross-task knowledge distillation to mitigate this trade-off. Specifically, we utilize a personalized voice activity detector (pVAD) during training to exclude the non-target speech frames that are wrongly identified as containing the target…
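The idea of using frame-level pVAD decisions during training can be pictured with a small sketch. The abstract is truncated here, so the exact rule the authors apply is not visible; the gating rule, the function name `pvad_gated_mse`, the `threshold` value, and the hard/soft switch below are all assumptions made for illustration, not the paper's loss. In this reading, a pVAD (a model trained for a different task, hence "cross-task") gives a per-frame probability that the target speaker is active, and frames it marks as inactive get a silent training target.

```python
def pvad_gated_mse(est_frames, ref_frames, pvad_probs, threshold=0.5, soft=False):
    """Mean squared error against a pVAD-gated reference.

    Illustrative only: the gating rule is an assumption, not the paper's loss.
    est_frames, ref_frames: lists of frames, each a list of floats.
    pvad_probs: per-frame probability that the target speaker is active.
    """
    if not (len(est_frames) == len(ref_frames) == len(pvad_probs)):
        raise ValueError("all inputs must have the same number of frames")

    total, count = 0.0, 0
    for est, ref, p in zip(est_frames, ref_frames, pvad_probs):
        # Hard gating keeps or zeroes the reference frame;
        # soft gating scales it by the pVAD probability.
        gate = p if soft else (1.0 if p >= threshold else 0.0)
        for e, r in zip(est, ref):
            total += (e - gate * r) ** 2
            count += 1
    return total / count if count else 0.0


# Two frames of two samples each; the pVAD marks the second frame as inactive.
est = [[0.5, 0.5], [0.2, 0.0]]
ref = [[0.5, 0.5], [0.2, 0.0]]
probs = [0.9, 0.1]

hard_loss = pvad_gated_mse(est, ref, probs)
soft_loss = pvad_gated_mse(est, ref, probs, soft=True)
```

With hard gating, the second frame's target becomes silence, so any energy the estimate leaves there is penalized as leakage, while the first frame is compared against the unmodified reference.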
Taxonomy
Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Infant Health and Development
Methods: Knowledge Distillation
