Multi-scenario deep learning for multi-speaker source separation
Jeroen Zegers, Hugo Van hamme

TL;DR
This paper demonstrates that training a single deep learning model on multiple multi-speaker scenarios can match the performance of models trained on specific scenarios, highlighting the importance of diverse data in source separation tasks.
Contribution
The paper introduces a multi-scenario training approach for deep learning models in multi-speaker source separation and shows that a single model trained this way can match scenario-specific models across mixtures with different numbers of speakers.
Findings
Data from one scenario improves performance in another.
Single multi-scenario model matches scenario-specific models.
Diverse training data enhances model generalization.
Abstract
Research in deep learning for multi-speaker source separation has received a boost in recent years. However, most studies are restricted to mixtures of a specific number of speakers, called a specific scenario. While some works have included experiments for different scenarios, research on combining data from different scenarios or creating a single model for multiple scenarios has been very rare. In this work it is shown that data from a specific scenario is relevant for solving another scenario. Furthermore, it is concluded that a single model trained on different scenarios is capable of matching the performance of scenario-specific models.
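To make the idea of combining data from different scenarios concrete, the sketch below pools two-speaker and three-speaker mixtures into one shuffled training set, so that a single model would see both scenarios during training. It is an illustration only and not the authors' pipeline: the function names (`make_mixture`, `build_multi_scenario_set`, `batches`) are hypothetical, random noise stands in for speech, and no separation network is included.

```python
import random


def make_mixture(num_speakers, num_samples, rng):
    """Create one synthetic example: the mixture is the sum of the source signals.

    Assumption: uniform noise is a placeholder for real speech recordings.
    """
    sources = [[rng.uniform(-1.0, 1.0) for _ in range(num_samples)]
               for _ in range(num_speakers)]
    mixture = [sum(frame) for frame in zip(*sources)]
    return {"num_speakers": num_speakers, "mixture": mixture, "sources": sources}


def build_multi_scenario_set(scenarios, per_scenario, num_samples, seed=0):
    """Pool examples from every scenario (speaker count) into one shuffled set."""
    rng = random.Random(seed)
    pooled = [make_mixture(n, num_samples, rng)
              for n in scenarios
              for _ in range(per_scenario)]
    rng.shuffle(pooled)  # interleave scenarios so batches can contain both
    return pooled


def batches(dataset, batch_size):
    """Yield consecutive slices of the pooled dataset."""
    for start in range(0, len(dataset), batch_size):
        yield dataset[start:start + batch_size]


# Hypothetical setup: 2-speaker and 3-speaker scenarios, 4 examples each.
train_set = build_multi_scenario_set(scenarios=(2, 3), per_scenario=4, num_samples=8)
for batch in batches(train_set, batch_size=4):
    # Print the number of speakers for each example in the batch.
    print([example["num_speakers"] for example in batch])
```

A scenario-specific baseline would correspond to calling `build_multi_scenario_set` with a single speaker count, for example `scenarios=(2,)`, and training a separate model for each count.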
Taxonomy
Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Music and Audio Processing
