Multi-scenario deep learning for multi-speaker source separation

Jeroen Zegers; Hugo Van hamme

arXiv:1808.08095·cs.LG·August 27, 2018

TL;DR

This paper demonstrates that training a single deep learning model on multiple multi-speaker scenarios can match the performance of models trained on specific scenarios, highlighting the importance of diverse data in source separation tasks.

Contribution

It introduces a multi-scenario training approach for deep learning models in multi-speaker source separation, showing its effectiveness across different scenarios.

Findings

01. Data from one scenario improves performance in another.
02. Single multi-scenario model matches scenario-specific models.
03. Diverse training data enhances model generalization.

Abstract

Research in deep learning for multi-speaker source separation has received a boost in recent years. However, most studies are restricted to mixtures of a specific number of speakers, called a specific scenario. While some works have included experiments for different scenarios, research into combining data from different scenarios or creating a single model for multiple scenarios has been very rare. This work shows that data from one scenario is relevant for solving another. Furthermore, it concludes that a single model trained on different scenarios is capable of matching the performance of scenario-specific models.
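
As a rough illustration of what "combining data of different scenarios" could look like in practice, the sketch below pools toy two-speaker and three-speaker mixtures into one shuffled training set. Everything here is an assumption made for illustration: the random signals stand in for speech, the names `make_mixture` and `build_multi_scenario_set` are hypothetical, and the equal number of examples per scenario is an arbitrary choice. None of it is taken from the paper or from the Nabu-MSSS repository.

```python
import random


def make_mixture(num_speakers, num_samples, rng):
    """Create a toy mixture: the sample-wise sum of random source signals.

    Random values stand in for real speech here (illustrative assumption).
    """
    sources = [
        [rng.uniform(-1.0, 1.0) for _ in range(num_samples)]
        for _ in range(num_speakers)
    ]
    mixture = [sum(samples) for samples in zip(*sources)]
    return {"num_speakers": num_speakers, "mixture": mixture, "sources": sources}


def build_multi_scenario_set(scenarios, examples_per_scenario, num_samples=8, seed=0):
    """Pool examples from several scenarios (speaker counts) into one shuffled set."""
    rng = random.Random(seed)
    pooled = [
        make_mixture(num_speakers, num_samples, rng)
        for num_speakers in scenarios
        for _ in range(examples_per_scenario)
    ]
    # Shuffle so a single model sees all scenarios interleaved during training.
    rng.shuffle(pooled)
    return pooled


# Pool a two-speaker and a three-speaker scenario into one training set.
train_set = build_multi_scenario_set(scenarios=(2, 3), examples_per_scenario=4)

# Count how many examples each scenario contributes.
counts = {}
for example in train_set:
    counts[example["num_speakers"]] = counts.get(example["num_speakers"], 0) + 1
print(counts)
```

A model trained on `train_set` would need to handle a varying number of output sources per example. How that is done is a modelling choice that this sketch deliberately leaves open.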

Code & Models

Repositories

JeroenZegers/Nabu-MSSS · tf · Official

Taxonomy

Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Music and Audio Processing