Singer separation for karaoke content generation

Hsuan-Yu Lin; Xuanjun Chen; Jyh-Shing Roger Jang

arXiv:2110.06707 · cs.SD · August 20, 2024 · 1 citation

Open Access

TL;DR

This paper introduces a singer separation system tailored for karaoke content generation, capable of isolating one or two lead singers from music tracks, with a new dataset and automatic model selection for real-world applications.

Contribution

The paper presents three novel models for singer separation, an automatic model selection scheme, and a new large dataset, MIR-SingerSeparation, for karaoke-focused singer separation research.

Findings

01 Effective separation of lead singers in karaoke tracks

02 Models perform best on sentimental ballads

03 First real-world karaoke singer separation system

Abstract

Thanks to the rapid development of deep learning, singing voice can now be separated successfully from mono music audio. However, this kind of separation only extracts all human voices from the accompanying instruments, which is undesirable for karaoke content generation, where only the lead singers need to be separated. For this karaoke application, we need to separate music containing male and female duets into two vocals, or to extract a single lead vocal from music containing vocal harmony. We therefore propose a singer separation system that generates karaoke content for one or two separated lead singers. In particular, we introduce three models for the singer separation task and design an automatic model selection scheme that determines how many lead singers are in a song. We also collected a sufficiently large dataset, MIR-SingerSeparation, which…

Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Speech Recognition and Synthesis