MaD TwinNet: Masker-Denoiser Architecture with Twin Networks for Monaural Sound Source Separation

Konstantinos Drossos; Stylianos Ioannis Mimilakis; Dmitriy Serdyuk; Gerald Schuller; Tuomas Virtanen; Yoshua Bengio

arXiv:1802.00300·cs.SD·February 2, 2018

TL;DR

This paper introduces MaD TwinNet, a deep learning architecture that improves monaural singing voice separation by modeling long-term musical structures using twin networks, achieving state-of-the-art results.

Contribution

It presents a novel combination of Masker-Denoiser architecture with Twin Networks to better capture long-term dependencies in music separation tasks.

Findings

01 Achieved a 0.37 dB improvement in signal-to-distortion ratio (SDR) over the previous SOTA.

02 Achieved a 0.23 dB improvement in signal-to-interference ratio (SIR) over the previous SOTA.

03 Evaluated on the Demixing Secret Dataset.
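
To put a gain such as 0.37 dB in context, the following is a minimal sketch of a simplified SDR: the ratio of reference energy to error energy, in decibels. This is an assumption made for illustration; it is not the BSS Eval decomposition commonly used for separation benchmarks, and the page does not state which implementation produced the reported numbers. The names `simple_sdr` and `db_to_energy_ratio` are hypothetical helpers.

```python
import math


def simple_sdr(reference, estimate, eps=1e-12):
    """Simplified SDR in dB: reference energy over error energy.

    Assumption: this ignores the distortion/interference/artifact
    decomposition of BSS Eval and treats all error as distortion.
    """
    signal_energy = sum(r * r for r in reference)
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    # eps guards against division by zero and log of zero
    return 10.0 * math.log10((signal_energy + eps) / (error_energy + eps))


def db_to_energy_ratio(gain_db):
    """Convert a gain in dB to the equivalent multiplicative energy ratio."""
    return 10.0 ** (gain_db / 10.0)


if __name__ == "__main__":
    reference = [1.0] * 100
    estimate = [0.9] * 100  # every sample is off by 0.1
    print(simple_sdr(reference, estimate))
    print(db_to_energy_ratio(0.37))
```

Under this simplified definition, a 0.37 dB gain corresponds to a signal-to-error energy ratio roughly 9% higher, which is a modest but measurable change.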

Abstract

The monaural singing voice separation task focuses on predicting the singing voice from a single-channel music mixture signal. Current state-of-the-art (SOTA) results in monaural singing voice separation are obtained with deep-learning-based methods. In this work we present a novel deep-learning-based method that learns long-term temporal patterns and structures of a musical piece. We build upon the recently proposed Masker-Denoiser (MaD) architecture and enhance it with Twin Networks, a technique that regularizes a recurrent generative network using a backward-running copy of the network. We evaluate our method on the Demixing Secret Dataset and obtain an increase in signal-to-distortion ratio (SDR) of 0.37 dB and in signal-to-interference ratio (SIR) of 0.23 dB, compared to previous SOTA results.
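
The abstract does not spell out how the backward-running copy regularizes the forward network. As a rough sketch, assuming the penalty is a Euclidean distance between an affine map of each forward hidden state and the backward hidden state at the same time step, the term added to the training loss could look like the code below. The function names, the affine parameters `W` and `b`, and the requirement that backward states are already re-ordered into forward time order are all assumptions made here for illustration, not details confirmed by this page.

```python
import math


def affine(W, b, h):
    """Apply a (hypothetical) learned affine map g(h) = W h + b."""
    return [sum(w * x for w, x in zip(row, h)) + b_i for row, b_i in zip(W, b)]


def twin_penalty(h_forward, h_backward, W, b):
    """Sum over time of ||g(h_f[t]) - h_b[t]||_2.

    h_forward and h_backward are lists of hidden-state vectors, one per
    time step. Assumption: h_backward has been re-ordered so that index t
    refers to the same time step as h_forward[t].
    """
    total = 0.0
    for h_f, h_b in zip(h_forward, h_backward):
        mapped = affine(W, b, h_f)
        total += math.sqrt(sum((m - x) ** 2 for m, x in zip(mapped, h_b)))
    return total


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    zero_bias = [0.0, 0.0]
    h_f = [[0.0, 0.0], [1.0, 1.0]]
    h_b = [[3.0, 4.0], [1.0, 1.0]]
    # The first step contributes a distance of 5, the second contributes 0.
    print(twin_penalty(h_f, h_b, identity, zero_bias))
```

In a setup like this, the penalty would be added to the separation loss during training only, and the backward network would be discarded at inference time.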
