And what if two musical versions don't share melody, harmony, rhythm, or lyrics?
Mathilde Abrassart, Guillaume Doras

TL;DR
This paper presents a metric learning-based music version identification system that combines melodic, harmonic, rhythmic, and lyrical features, achieving state-of-the-art results and suggesting near-optimal performance potential.
Contribution
It introduces a simple yet effective model leveraging four musical dimensions for version identification, with lyrics as a key discriminative feature.
Findings
Achieved state-of-the-art performance on two datasets.
Demonstrated the complementarity of multiple musical features.
Suggested that combining these four features may approach optimal performance.
Abstract
Version identification (VI) has seen substantial progress over the past few years. On the one hand, the introduction of the metric learning paradigm has favored the emergence of scalable yet accurate VI systems. On the other hand, using features focusing on specific aspects of musical pieces, such as melody, harmony, or lyrics, yielded interpretable and promising performances. In this work, we build upon these recent advances and propose a metric learning-based system systematically leveraging four dimensions commonly admitted to convey musical similarity between versions: melodic line, harmonic structure, rhythmic patterns, and lyrics. We describe our deliberately simple model architecture, and we show in particular that an approximated representation of the lyrics is an efficient proxy to discriminate between versions and non-versions. We then describe how these features complement…
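As a rough illustration of the metric learning idea described in the abstract, the sketch below compares two recordings through one embedding per musical dimension and averages the per-dimension distances. It is not the authors' architecture: the abstract does not say how the four representations are combined or which loss is used. The names `FEATURES`, `fused_distance`, and `triplet_loss`, the uniform weights, the cosine distance, and the margin value are all assumptions made for this example.

```python
import numpy as np

# Hypothetical names for the four dimensions listed in the abstract.
FEATURES = ("melody", "harmony", "rhythm", "lyrics")


def l2_normalize(x, eps=1e-12):
    """Scale vectors to unit length along the last axis."""
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norm, eps)


def cosine_distance(query, candidates):
    """Cosine distance between one query (d,) and n candidates (n, d)."""
    q = l2_normalize(query)
    c = l2_normalize(candidates)
    return 1.0 - c @ q


def fused_distance(query_emb, candidate_embs, weights=None):
    """Weighted mean of per-feature cosine distances (assumed late fusion).

    query_emb:      dict feature -> array of shape (d,)
    candidate_embs: dict feature -> array of shape (n, d)
    Returns an array of shape (n,); smaller means more likely a version.
    """
    if weights is None:
        weights = {name: 1.0 for name in FEATURES}  # assumption: uniform
    total = sum(weights[name] for name in FEATURES)
    n = candidate_embs[FEATURES[0]].shape[0]
    fused = np.zeros(n)
    for name in FEATURES:
        fused += weights[name] * cosine_distance(
            query_emb[name], candidate_embs[name]
        )
    return fused / total


def triplet_loss(anchor, positive, negative, margin=0.5):
    """Generic triplet loss on cosine distance (margin is an assumption).

    Pulls a version (positive) closer to the anchor than a
    non-version (negative) by at least `margin`.
    """
    d_pos = cosine_distance(anchor, positive[None, :])[0]
    d_neg = cosine_distance(anchor, negative[None, :])[0]
    return max(0.0, d_pos - d_neg + margin)
```

At query time, candidates would be ranked by `np.argsort(fused_distance(query_emb, candidate_embs))`. Averaging distances keeps each dimension inspectable on its own, which fits the abstract's emphasis on interpretable, complementary features, but other fusion schemes (for example concatenating embeddings) are equally plausible.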
Taxonomy
Topics: Music and Audio Processing · Diverse Musicological Studies · Music Technology and Sound Studies
