Harmonic-Percussive Source Separation with Deep Neural Networks and Phase Recovery

Konstantinos Drossos, Paul Magron, Stylianos Ioannis Mimilakis, and Tuomas Virtanen

arXiv:1807.11298 · cs.SD · July 31, 2018 · 1 cites

Open Access

TL;DR

This paper introduces a deep learning approach using MaD TwinNet for harmonic-percussive source separation, incorporating phase recovery to improve the quality of separated sources in music mixtures.

Contribution

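It adapts the MaD TwinNet architecture, previously applied to monaural singing voice separation, to HPSS by using it to estimate the magnitude spectrogram of the percussive source. It then pairs this estimate with a phase recovery algorithm that constrains the harmonic phase to a sinusoidal model. On realistic music mixtures, the resulting system outperforms the previous state-of-the-art kernel additive approach.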

Findings

01. Outperforms the previous kernel additive model approach.

02. Effective phase recovery enhances separation quality.

03. The deep neural network architecture improves harmonic-percussive separation.

Abstract

Harmonic/percussive source separation (HPSS) consists in separating the pitched instruments from the percussive parts in a music mixture. In this paper, we propose to apply the recently introduced Masker-Denoiser with twin networks (MaD TwinNet) system to this task. MaD TwinNet is a deep learning architecture that has reached state-of-the-art results in monaural singing voice separation. Herein, we propose to apply it to HPSS by using it to estimate the magnitude spectrogram of the percussive source. Then, we retrieve the complex-valued short-time Fourier transform of the sources by means of a phase recovery algorithm, which minimizes the reconstruction error and enforces the phase of the harmonic part to follow a sinusoidal phase model. Experiments conducted on realistic music mixtures show that this novel separation system outperforms the previous state-of-the-art kernel additive…
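
The sinusoidal phase model mentioned in the abstract can be illustrated with a minimal sketch. It rests on the standard relation for a stationary sinusoid: between two consecutive STFT frames, the phase at the sinusoid's bin advances by 2π · hop · ν, where ν is the normalized frequency in cycles per sample. This is only an illustration of that relation, not the paper's phase recovery algorithm; the function names (`propagate_phase`, `peak_bin_phase`) and the parameter values are assumptions chosen for the example.

```python
import numpy as np

def propagate_phase(phase_prev, nu, hop):
    """Advance an STFT phase by one frame under a stationary sinusoid model.

    phase_prev: phase in radians in the previous frame
    nu: normalized frequency in cycles per sample
    hop: hop size in samples
    (Hypothetical helper, written for this illustration only.)
    """
    phase = phase_prev + 2.0 * np.pi * hop * nu
    return np.angle(np.exp(1j * phase))  # wrap to (-pi, pi]

def peak_bin_phase(x, start, n_fft, k):
    """Phase of bin k for the Hann-windowed frame starting at sample `start`."""
    frame = x[start:start + n_fft] * np.hanning(n_fft)
    return np.angle(np.fft.rfft(frame)[k])

# Illustrative settings (assumptions, not values from the paper).
n_fft, hop, nu = 1024, 256, 0.1
n = np.arange(n_fft + hop)
x = np.cos(2.0 * np.pi * nu * n + 0.3)   # a single stationary sinusoid
k = int(round(nu * n_fft))               # bin closest to the sinusoid

phase_0 = peak_bin_phase(x, 0, n_fft, k)
phase_1_measured = peak_bin_phase(x, hop, n_fft, k)
phase_1_predicted = propagate_phase(phase_0, nu, hop)

# Compare on the unit circle so that 2*pi wrap-arounds do not matter.
error = np.angle(np.exp(1j * (phase_1_predicted - phase_1_measured)))
print(f"wrapped prediction error: {abs(error):.2e} rad")
```

For a pure tone, the predicted and measured phases should agree closely. In a real mixture the frequency ν would have to be estimated per bin and per frame, and percussive content does not follow this model, which is consistent with the abstract applying the constraint to the harmonic part only.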

Taxonomy

Topics: Speech and Audio Processing · Music and Audio Processing · Music Technology and Sound Studies