MMS-MSG: A Multi-purpose Multi-Speaker Mixture Signal Generator

Tobias Cord-Landwehr; Thilo von Neumann; Christoph Boeddeker; Reinhold Haeb-Umbach

arXiv:2209.11494·eess.AS·September 26, 2022·IWAENC

TL;DR

MMS-MSG is a flexible, modular signal generator that creates diverse multi-speaker speech mixtures from any corpus, aiding development, evaluation, and training of speech enhancement systems in various environments.

Contribution

The paper introduces a versatile, extensible tool for generating realistic multi-speaker mixtures across different acoustic scenarios from any speech corpus.

Findings

01. Generated datasets enable effective training and evaluation.

02. Baseline results demonstrate the utility of MMS-MSG in real-world scenarios.

03. Flexible simulation of diverse acoustic environments.

Abstract

The scope of speech enhancement has changed from a monolithic view of single, independent tasks to a joint processing of complex conversational speech recordings. Training and evaluation of these single tasks require synthetic data with access to intermediate signals that is as close as possible to the evaluation scenario. As such data is often not available, many works instead use specialized databases for the training of each system component, e.g., WSJ0-mix for source separation. We present a Multi-purpose Multi-Speaker Mixture Signal Generator (MMS-MSG) for generating a variety of speech mixture signals based on any speech corpus, ranging from classical anechoic mixtures (e.g., WSJ0-mix) over reverberant mixtures (e.g., SMS-WSJ) to meeting-style data. Its highly modular and flexible structure allows for the simulation of diverse environments and dynamic mixing, while simultaneously…
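To make the idea of an anechoic mixture with accessible intermediate signals concrete, here is a minimal NumPy sketch. It is not the MMS-MSG API: the function name `mix_utterances`, the random stand-in signals, and the chosen offsets are all hypothetical and serve only as an illustration.

```python
import numpy as np


def mix_utterances(sources, offsets):
    """Sum single-speaker signals into one mixture (hypothetical helper).

    sources: list of 1-D arrays, one per speaker
    offsets: start sample of each source inside the mixture
    Returns the mixture and the zero-padded per-speaker signals.
    """
    num_samples = max(o + len(s) for s, o in zip(sources, offsets))
    # Keep every padded source so the per-speaker targets stay available
    # alongside the mixture (the "intermediate signals").
    padded = np.zeros((len(sources), num_samples))
    for i, (s, o) in enumerate(zip(sources, offsets)):
        padded[i, o:o + len(s)] = s
    mixture = padded.sum(axis=0)
    return mixture, padded


# Random noise stands in for two utterances from a speech corpus.
rng = np.random.default_rng(0)
a = rng.standard_normal(8000)
b = rng.standard_normal(6000)

# Speaker b starts 4000 samples after speaker a, so the two partially overlap.
mixture, padded = mix_utterances([a, b], offsets=[0, 4000])
```

Returning the padded sources next to the mixture is the design point here: a separation model can be trained and scored against exactly the signals that were summed. Reverberant or meeting-style data would need further steps (room impulse responses, longer speaker activity patterns) that this sketch leaves out.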

Code & Models

Repositories

fgnt/mms_msg (pytorch · Official)

Taxonomy

Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Speech and dialogue systems