Improving Lyrics Alignment through Joint Pitch Detection

Jiawen Huang; Emmanouil Benetos; Sebastian Ewert

arXiv:2202.01646 · cs.SD · February 4, 2022

TL;DR

This paper proposes a multi-task learning method that trains lyrics alignment jointly with pitch detection, using the high temporal accuracy of pitch annotations to improve lyrics timing for singing voice.

Contribution

The paper presents a joint pitch detection and lyrics alignment framework. It exploits pitch, a musical property that ASR-derived alignment systems typically ignore, to improve alignment accuracy.

Findings

1. Improved lyrics alignment accuracy with joint pitch information

2. Boundary detection reduces cross-line errors

3. Enhanced alignment performance over traditional methods

Abstract

In recent years, the accuracy of automatic lyrics alignment methods has increased considerably. Yet, many current approaches employ frameworks designed for automatic speech recognition (ASR) and do not exploit properties specific to music. Pitch is one important musical attribute of singing voice but it is often ignored by current systems as the lyrics content is considered independent of the pitch. In practice, however, there is a temporal correlation between the two as note starts often correlate with phoneme starts. At the same time the pitch is usually annotated with high temporal accuracy in ground truth data while the timing of lyrics is often only available at the line (or word) level. In this paper, we propose a multi-task learning approach for lyrics alignment that incorporates pitch and thus can make use of a new source of highly accurate temporal information. Our results show…
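The multi-task idea in the abstract can be illustrated with a minimal sketch. The excerpt above does not state how the two objectives are combined, so everything here is an assumption made for illustration: the weighted-sum form, the frame-level cross-entropy for pitch, and the names `frame_cross_entropy`, `multitask_loss`, and `pitch_weight` are hypothetical and not taken from the paper or its repository.

```python
import math


def frame_cross_entropy(probs, targets):
    """Mean negative log-likelihood of the target class over all frames.

    probs:   list of per-frame probability distributions over pitch classes
    targets: list of per-frame target class indices
    """
    if len(probs) != len(targets) or not targets:
        raise ValueError("probs and targets must be non-empty and equal in length")
    return -sum(math.log(p[t]) for p, t in zip(probs, targets)) / len(targets)


def multitask_loss(alignment_loss, pitch_probs, pitch_targets, pitch_weight=0.5):
    """Combine an alignment loss with a frame-level pitch loss.

    Assumption: a simple weighted sum. `alignment_loss` stands in for whatever
    loss the lyrics branch produces; it is passed in as a plain number here.
    """
    pitch_loss = frame_cross_entropy(pitch_probs, pitch_targets)
    return alignment_loss + pitch_weight * pitch_loss


# Two frames, two hypothetical pitch classes.
pitch_probs = [[0.5, 0.5], [0.25, 0.75]]
pitch_targets = [0, 1]
total = multitask_loss(1.0, pitch_probs, pitch_targets, pitch_weight=0.5)
```

In a setup like this, both losses would be computed from a shared acoustic representation, so gradients from the accurately timed pitch labels can also shape the features used for alignment. Setting `pitch_weight` to 0 reduces the objective to the alignment loss alone.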

Code & Models

Repositories

jhuang448/lyricsalignment-mtl (PyTorch, official)

Taxonomy

Topics: Music and Audio Processing · Speech Recognition and Synthesis · Speech and Audio Processing