Informed Group-Sparse Representation for Singing Voice Separation

Tak-Shing T. Chan; Yi-Hsuan Yang

arXiv:1801.03815 · eess.AS · January 12, 2018

TL;DR

This paper introduces informed group-sparse representation, a linear-time method for singing voice separation. It uses pitch annotations as side information and avoids the singular value decompositions that make low-rank approaches computationally expensive.

Contribution

The paper presents a linear-time algorithm for informed group-sparse representation, applies it to singing voice separation with pitch annotations as side information, and shows how the approach extends to multiple dictionaries.

Findings

01 Effective separation on the iKala dataset

02 Music accompaniment appears group-sparse given a pre-trained instrumental dictionary

03 Method extends to multiple dictionaries

Abstract

Singing voice separation attempts to separate the vocal and instrumental parts of a music recording, which is a fundamental problem in music information retrieval. Recent work on singing voice separation has shown that the low-rank representation and informed separation approaches are both able to improve separation quality. However, low-rank optimizations are computationally inefficient due to the use of singular value decompositions. Therefore, in this paper, we propose a new linear-time algorithm called informed group-sparse representation, and use it to separate the vocals from music using pitch annotations as side information. Experimental results on the iKala dataset confirm the efficacy of our approach, suggesting that the music accompaniment follows a group-sparse structure given a pre-trained instrumental dictionary. We also show how our work can be easily extended to…
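
The abstract does not give the paper's exact formulation, so the following is only a minimal sketch of why a group-sparse penalty can avoid singular value decompositions. It assumes the groups are the rows of a coefficient matrix and applies the standard shrinkage step for a sum-of-row-norms penalty; the function name `group_soft_threshold`, the threshold `tau`, and the toy matrix are illustrative, not taken from the paper.

```python
import math


def group_soft_threshold(matrix, tau):
    """Shrink each row of `matrix` toward zero by `tau` in L2 norm.

    This is the proximal operator of tau * (sum of row L2 norms).
    Treating rows as the groups is an assumption made for illustration.
    """
    result = []
    for row in matrix:
        norm = math.sqrt(sum(x * x for x in row))
        # A row whose norm is at most tau is zeroed as a whole group;
        # otherwise the row is scaled down, keeping its direction.
        scale = max(0.0, 1.0 - tau / norm) if norm > 0.0 else 0.0
        result.append([scale * x for x in row])
    return result


# Toy coefficient matrix (hypothetical values): one strong row, one weak row,
# and one row that is already zero.
coeffs = [[3.0, 4.0], [0.3, 0.4], [0.0, 0.0]]
shrunk = group_soft_threshold(coeffs, 1.0)
```

Each entry is visited a constant number of times, so this step costs time linear in the number of matrix entries, whereas the analogous step for a low-rank penalty thresholds singular values and therefore needs an SVD.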
