Multimodal Lyrics-Rhythm Matching
Callie C. Liao, Duoduo Liao, Jesse Guessford

TL;DR
This paper introduces a novel multimodal approach for matching lyrics and rhythm in music, leveraging audio data to analyze correlations between lyrical components and rhythmic elements without language restrictions.
Contribution
The paper presents a method that matches key components of lyrics (keywords, stressed syllables) with strong beats using multimodal cues extracted from audio, enabling cross-linguistic analysis of lyrics-rhythm relationships.
Findings
Average matching probability of 0.81 across songs
30% of songs have keywords landing on strong beats with probability ≥0.9
Nearly 50% of songs show high similarity (≥0.70) between lyrics and rhythm
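To make the "matching probability" figures above concrete, here is a minimal sketch of one way such a number could be computed. It is not the authors' pipeline. It assumes keyword onset times and strong-beat times (in seconds) have already been extracted from the audio, and it counts a keyword as matched if it falls within a tolerance window of a strong beat. The function name `matching_probability`, the `tolerance` parameter, and the 70 ms default are all hypothetical choices for illustration.

```python
from bisect import bisect_left


def matching_probability(keyword_onsets, strong_beats, tolerance=0.07):
    """Fraction of keyword onsets lying within `tolerance` seconds of a strong beat.

    Illustrative only: the inputs and the tolerance window are assumptions,
    not details taken from the paper.
    """
    if not keyword_onsets:
        return 0.0
    beats = sorted(strong_beats)
    hits = 0
    for t in keyword_onsets:
        i = bisect_left(beats, t)
        # The nearest strong beat is either just before or just after t.
        neighbors = beats[max(i - 1, 0): i + 1]
        if any(abs(t - b) <= tolerance for b in neighbors):
            hits += 1
    return hits / len(keyword_onsets)


# Hypothetical timings: strong beats every 2 s, five keyword onsets.
strong_beats = [0.0, 2.0, 4.0, 6.0]
keyword_onsets = [0.02, 1.0, 2.05, 4.0, 5.9]
print(matching_probability(keyword_onsets, strong_beats))
```

Under this reading, a song-level probability of 0.9 or higher would mean that at least nine in ten detected keywords coincide with a strong beat within the chosen window.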
Abstract
Despite the recent increase in research on artificial intelligence for music, prominent correlations between key components of lyrics and rhythm such as keywords, stressed syllables, and strong beats are not frequently studied. This is likely due to challenges such as audio misalignment, inaccuracies in syllabic identification, and most importantly, the need for cross-disciplinary knowledge. To address this lack of research, we propose a novel multimodal lyrics-rhythm matching approach in this paper that specifically matches key components of lyrics and music with each other without any language limitations. We use audio instead of sheet music with readily available metadata, which creates more challenges yet increases the application flexibility of our method. Furthermore, our approach creatively generates several patterns involving various multimodalities, including music strong…
Taxonomy
Topics: Music and Audio Processing · Music Technology and Sound Studies · Speech and Audio Processing
