Multi-Modal Chorus Recognition for Improving Song Search

Jiaan Wang; Zhixu Li; Binbin Gu; Tingyi Zhang; Qingsheng Liu; Zhigang Chen

arXiv:2106.16153 · cs.IR · July 1, 2021

TL;DR

This paper introduces a multi-modal chorus recognition model that uses both lyrics and tune information, releases a new 627-song dataset, and reports that the model outperforms several baselines and improves downstream song search accuracy by more than 10.6%.

Contribution

The paper presents the first chorus recognition dataset and a novel multi-modal model that outperforms baselines and improves downstream song search accuracy.

Findings

01 The proposed model outperforms several baselines in chorus recognition.

02 Chorus recognition improves song search accuracy by more than 10.6%.

03 The published dataset contains 627 songs and is available for public use.

Abstract

We discuss a novel task, Chorus Recognition, which could potentially benefit downstream tasks such as song search and music summarization. Unlike existing tasks such as music summarization or lyrics summarization, which rely on single-modal information, this paper models chorus recognition as a multi-modal task that uses both the lyrics and the tune information of songs. We propose a multi-modal Chorus Recognition model that considers diverse features. We also create and publish the first Chorus Recognition dataset, containing 627 songs, for public use. Our empirical study on the dataset demonstrates that our approach outperforms several baselines in chorus recognition. In addition, our approach improves the accuracy of its downstream task, song search, by more than 10.6%.
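To make the multi-modal idea concrete, here is a toy sketch of scoring each lyric line with one lyrics-side signal and one tune-side signal. It is not the model from the paper, whose architecture and features are not described on this page. Every name in it (`LyricLine`, `energy`, `repetition_scores`, `chorus_scores`, `predict_chorus`), the use of line repetition as the lyrics feature, the per-line energy value as the tune feature, and the weight and threshold are assumptions chosen only for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class LyricLine:
    text: str
    energy: float  # hypothetical tune-side feature in [0, 1], assumed precomputed


def repetition_scores(lines):
    """Lyrics-side feature: how often each line's text recurs in the song, scaled to [0, 1]."""
    if not lines:
        return []
    keys = [line.text.strip().lower() for line in lines]
    counts = Counter(keys)
    max_count = max(counts.values())
    return [counts[key] / max_count for key in keys]


def chorus_scores(lines, lyric_weight=0.5):
    """Late fusion (an assumption): weighted sum of the lyrics and tune features."""
    rep = repetition_scores(lines)
    return [
        lyric_weight * r + (1.0 - lyric_weight) * line.energy
        for r, line in zip(rep, lines)
    ]


def predict_chorus(lines, threshold=0.6, lyric_weight=0.5):
    """Mark a line as chorus when its fused score reaches the (illustrative) threshold."""
    return [score >= threshold for score in chorus_scores(lines, lyric_weight)]


# Made-up example song; the energy values are invented for illustration.
song = [
    LyricLine("walking down the road", 0.3),
    LyricLine("sing it loud tonight", 0.9),
    LyricLine("thinking of the days", 0.4),
    LyricLine("Sing it loud tonight", 0.8),
]
labels = predict_chorus(song)
```

In this sketch, a line that repeats and also has high energy scores higher than a line with only one of the two signals, which is the intuition behind combining modalities. A learned model would replace both hand-made features and the fixed weighted sum.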

Code & Models

Repositories

krystalan/MMCR (PyTorch, official)

Taxonomy

Topics: Music and Audio Processing · Speech Recognition and Synthesis · Music Technology and Sound Studies