TGIF: Talker Group-Informed Familiarization of Target Speaker Extraction
Tsun-An Hsieh, Minje Kim

TL;DR
This paper introduces TGIF, an approach that specializes a target speaker extraction (TSE) model to a particular small group of talkers, such as a family, using knowledge distillation from a large teacher model.
Contribution
The paper proposes TGIF, a talker group-informed familiarization method for TSE. A group-specific student model learns from pseudo-clean targets generated by a large teacher model, so the system can adapt to a specific talker group without requiring clean speech targets.
Findings
Outperforms baseline generic models in talker group scenarios
Effective adaptation to speech characteristics of specific talker groups
Maintains computational efficiency in specialized TSE models
Abstract
State-of-the-art target speaker extraction (TSE) systems are typically designed to generalize to any given mixing environment, necessitating a model with a large enough capacity as a generalist. Personalized speech enhancement could be a specialized solution that adapts to single-user scenarios, but it overlooks the practical need for customization in cases where only a small number of talkers are involved, e.g., TSE for a specific family. We address this gap with the proposed concept, talker group-informed familiarization (TGIF) of TSE, where the TSE system specializes in a particular group of users, which is challenging due to the inherent absence of a clean speech target. To this end, we employ a knowledge distillation approach, where a group-specific student model learns from the pseudo-clean targets generated by a large teacher model. This tailors the student model to effectively…
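The distillation idea described in the abstract can be illustrated with a deliberately tiny sketch. It is not the authors' implementation: the "teacher" here is a hypothetical stand-in that scales the mixture by a fixed gain, the "student" is a single learnable gain, and the names `teacher_extract`, `student_extract`, `distill`, and `TEACHER_GAIN` are assumptions made for illustration. The only point it demonstrates is that the student is trained on the teacher's outputs for the group's mixtures, without any ground-truth clean speech.

```python
import random

# Hypothetical stand-in for a large, pretrained generalist TSE teacher.
# A real teacher would be a neural network conditioned on an enrollment
# utterance; here it simply scales the mixture by a fixed gain.
TEACHER_GAIN = 0.6


def teacher_extract(mixture):
    """Return the teacher's pseudo-clean estimate for one mixture."""
    return [TEACHER_GAIN * x for x in mixture]


def student_extract(mixture, w):
    """Toy student model: a single learnable gain w (an assumption)."""
    return [w * x for x in mixture]


def distill(mixtures, lr=0.1, epochs=200):
    """Fit the student to the teacher's pseudo-clean targets with SGD on MSE."""
    w = 0.0
    for _ in range(epochs):
        for mix in mixtures:
            # No ground-truth clean speech is used anywhere in this loop.
            pseudo_clean = teacher_extract(mix)
            estimate = student_extract(mix, w)
            # Gradient of the mean squared error with respect to w.
            grad = sum(
                2.0 * (e - p) * x
                for e, p, x in zip(estimate, pseudo_clean, mix)
            ) / len(mix)
            w -= lr * grad
    return w


# Synthetic "group-specific" mixtures; real data would be noisy recordings
# of the talker group.
rng = random.Random(0)
group_mixtures = [[rng.uniform(-1.0, 1.0) for _ in range(64)] for _ in range(8)]

student_gain = distill(group_mixtures)
print(round(student_gain, 3))
```

In this toy setting the student can only ever match the teacher. In the setting the abstract describes, the student is a smaller model that sees only one group's data, which is what allows it to specialize.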
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Speech and dialogue systems
