HTMD-Net: A Hybrid Masking-Denoising Approach to Time-Domain Monaural Singing Voice Separation

Christos Garoufis; Athanasia Zlatintsi; Petros Maragos

arXiv:2103.04336·eess.AS·March 9, 2021

TL;DR

HTMD-Net is a hybrid time-domain method for monaural singing voice separation that pairs a lightweight masking component with a denoising module, aiming to reduce the artifacts that pure masking approaches introduce while improving computational efficiency.

Contribution

The paper presents a hybrid approach to monaural singing voice separation in which a denoising module, based on skip connections, refines the source estimated by a lightweight masking component.

Findings

1. Achieves competitive separation performance on the musdb18 dataset.
2. Reduces artifacts during silent segments.
3. Offers higher computational efficiency.

Abstract

The advent of deep learning has led to the prevalence of deep neural network architectures for monaural music source separation, with end-to-end approaches that operate directly on the waveform level increasingly receiving research attention. Among these approaches, transformation of the input mixture to a learned latent space, and multiplicative application of a soft mask to the latent mixture, achieves the best performance, but is prone to the introduction of artifacts to the source estimate. To alleviate this problem, in this paper we propose a hybrid time-domain approach, termed the HTMD-Net, combining a lightweight masking component and a denoising module, based on skip connections, in order to refine the source estimated by the masking procedure. Evaluation of our approach in the task of monaural singing voice separation in the musdb18 dataset indicates that our proposed method…
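
The pipeline the abstract describes (encode the mixture into a latent space, multiply by a soft mask, decode, then refine with a denoiser) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the authors' architecture: the filters are random rather than learned, the hypothetical `estimate_mask` is a per-channel sigmoid gate instead of a neural network, and the hypothetical `denoise` is a three-tap smoother with an identity skip path standing in for the paper's skip-connection-based denoising module.

```python
import math
import random

def encode(x, filters, stride):
    """Strided 1-D convolution: waveform -> list of latent frames."""
    k = len(filters[0])
    return [[sum(w * s for w, s in zip(f, x[t:t + k])) for f in filters]
            for t in range(0, len(x) - k + 1, stride)]

def estimate_mask(latent, gains):
    """Toy mask estimator (assumption): a sigmoid gate per latent channel."""
    return [[1.0 / (1.0 + math.exp(-g * v)) for g, v in zip(gains, frame)]
            for frame in latent]

def apply_mask(latent, mask):
    """Multiplicative soft masking of the latent mixture."""
    return [[m * v for m, v in zip(mf, lf)] for mf, lf in zip(mask, latent)]

def decode(latent, filters, stride, length):
    """Transposed convolution (overlap-add) back to a waveform."""
    k = len(filters[0])
    y = [0.0] * length
    for i, frame in enumerate(latent):
        for coeff, f in zip(frame, filters):
            for j in range(k):
                y[i * stride + j] += coeff * f[j]
    return y

def denoise(estimate):
    """Toy refinement (assumption): identity skip path plus half of a
    smoothing correction computed from a 3-tap moving average."""
    n = len(estimate)
    out = []
    for i in range(n):
        window = estimate[max(0, i - 1):min(n, i + 2)]
        smoothed = sum(window) / len(window)
        out.append(estimate[i] + 0.5 * (smoothed - estimate[i]))
    return out

def separate(mixture, filters, gains, stride):
    """Masking stage followed by the denoising stage."""
    latent = encode(mixture, filters, stride)
    mask = estimate_mask(latent, gains)
    coarse = decode(apply_mask(latent, mask), filters, stride, len(mixture))
    return denoise(coarse)

# Illustrative values only; a real system would learn filters and masks.
random.seed(0)
KERNEL, STRIDE, CHANNELS = 8, 4, 6
filters = [[random.uniform(-1, 1) for _ in range(KERNEL)] for _ in range(CHANNELS)]
gains = [random.uniform(0.5, 2.0) for _ in range(CHANNELS)]
mixture = [math.sin(0.2 * n) + 0.3 * math.sin(1.7 * n) for n in range(64)]
vocals = separate(mixture, filters, gains, STRIDE)
```

The sketch only shows the data flow: the masking stage produces a coarse waveform estimate, and the second stage operates on that estimate rather than on the latent mixture. It makes no claim about separation quality.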

Taxonomy

Topics: Speech and Audio Processing · Music and Audio Processing · Acoustic Wave Phenomena Research