Multi-Task Audio Source Separation

Lu Zhang, Chenxing Li, Feng Deng, and Xiaorui Wang

arXiv:2107.06467 · eess.AS · July 15, 2021 · 1 citation

Open Access · 1 Repo

TL;DR

This paper introduces a new multi-task audio source separation (MTASS) challenge, proposes a complex-domain model with residual signal compensation, and reports that it outperforms several well-known separation models at separating speech, music, and noise from monaural mixtures.

Contribution

It presents a novel multi-task separation framework and a new dataset, and shows improved results over existing models in separating multiple audio sources.

Findings

01. The complex ratio mask is effective for multi-task separation.
02. Residual signal compensation improves separation quality.
03. The proposed model outperforms several well-known separation models.
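
To make the first finding concrete, here is a minimal sketch of what a complex ratio mask does on a handful of time-frequency bins. It is an illustration under stated assumptions, not the paper's model: the bin values are made up, the mask is the ideal (oracle) one computed from known sources rather than predicted by a network, and the helper names `ideal_crm`, `apply_mask`, and `magnitude_mask_estimate` are hypothetical.

```python
# Toy STFT bins for three sources (made-up values, not data from the paper).
sources = {
    "speech": [1 + 2j, 0.5 - 1j, -0.3 + 0.1j],
    "music": [0.2 - 0.5j, 1 + 1j, 0.4 + 0.4j],
    "noise": [-0.1 + 0.1j, 0.2 + 0.3j, 0.1 - 0.2j],
}
num_bins = 3

# The monaural mixture is the bin-wise sum of the three sources.
mixture = [sum(src[k] for src in sources.values()) for k in range(num_bins)]


def ideal_crm(source_bins, mixture_bins):
    """Ideal complex ratio mask M = S / Y per bin (assumes every Y is nonzero)."""
    return [s / y for s, y in zip(source_bins, mixture_bins)]


def apply_mask(mask, mixture_bins):
    """Complex multiplication: rescales the magnitude and rotates the phase."""
    return [m * y for m, y in zip(mask, mixture_bins)]


def magnitude_mask_estimate(source_bins, mixture_bins):
    """Real-valued ratio mask |S| / |Y|: corrects magnitude, keeps mixture phase."""
    return [abs(s) / abs(y) * y for s, y in zip(source_bins, mixture_bins)]


def max_error(estimate, reference):
    """Largest per-bin absolute error between two lists of complex bins."""
    return max(abs(e - r) for e, r in zip(estimate, reference))


masks = {name: ideal_crm(src, mixture) for name, src in sources.items()}
estimates = {name: apply_mask(mask, mixture) for name, mask in masks.items()}

crm_error = max_error(estimates["speech"], sources["speech"])
mag_error = max_error(
    magnitude_mask_estimate(sources["speech"], mixture), sources["speech"]
)
print(f"complex mask error:   {crm_error:.2e}")
print(f"magnitude mask error: {mag_error:.2e}")
```

In this toy setting the complex mask recovers each source up to floating-point error, while the real-valued mask leaves a phase error because it reuses the mixture phase. In practice a network has to predict the mask, so the oracle result here is only an upper bound on what masking can achieve.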

Abstract

Audio source separation tasks, such as speech enhancement, speech separation, and music source separation, have achieved impressive performance in recent studies. The powerful modeling capabilities of deep neural networks give us hope for more challenging tasks. This paper launches a new multi-task audio source separation (MTASS) challenge to separate the speech, music, and noise signals from a monaural mixture. First, we introduce the details of this task and generate a dataset of mixtures containing speech, music, and background noise. Then, we propose an MTASS model in the complex domain to fully utilize the differences in spectral characteristics of the three audio signals. In detail, the proposed model follows a two-stage pipeline, which separates the three types of audio signals and then performs signal compensation separately. After comparing different training targets,…
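
The two-stage flow described in the abstract (separate first, then compensate each track) can be sketched on a toy signal. Everything below is an assumption made for illustration: the residual is taken to be the mixture minus the coarse estimate, the stage-1 output is simulated as an over-suppressed copy of the speech, and the learned compensation module is replaced by a plain least-squares projection. None of this is the paper's network; it only shows the direction of data flow.

```python
import math

N = 64  # number of samples in the toy signals


def tone(freq_bin, amp=1.0):
    """Sine with an integer number of periods over N samples."""
    return [amp * math.sin(2 * math.pi * freq_bin * n / N) for n in range(N)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def mse(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Toy "speech", "music", and "noise": mutually orthogonal tones (assumption).
speech, music, noise = tone(3), tone(5), tone(11, amp=0.3)
mixture = [s + m + n for s, m, n in zip(speech, music, noise)]

# Stage 1 (stand-in for a separation network): an over-suppressed speech track.
coarse = [0.8 * s for s in speech]

# Residual: the part of the mixture that the coarse estimate does not explain.
residual = [y - c for y, c in zip(mixture, coarse)]

# Stage 2 (stand-in for a learned compensation module): project the residual
# onto the coarse estimate and add the recovered component back.
alpha = dot(residual, coarse) / dot(coarse, coarse)
compensated = [c + alpha * c for c in coarse]

print(f"MSE before compensation: {mse(coarse, speech):.4f}")
print(f"MSE after compensation:  {mse(compensated, speech):.2e}")
```

The projection works here only because the toy sources are orthogonal tones; real speech, music, and noise overlap in time and frequency, which is presumably why a trained module is used for this step.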

Code & Models

Repositories

Windstudent/Complex-MTASSNet
PyTorch · Official

Taxonomy

Topics: Speech and Audio Processing · Music and Audio Processing · Advanced Adaptive Filtering Techniques