Cover Detection using Dominant Melody Embeddings
Guillaume Doras, Geoffroy Peeters

TL;DR
This paper introduces a neural network-based approach for cover song detection that uses dominant melody embeddings, achieving high accuracy and scalability for large audio databases by representing tracks as single vectors.
Contribution
The paper presents a novel neural network architecture that extracts dominant melody embeddings for efficient and accurate cover detection at scale.
Findings
Improved accuracy over state-of-the-art methods on small and large datasets.
Scalable to thousands of tracks with query times of a few seconds.
Embedding extraction can be done offline, enabling fast pairwise comparisons.
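The offline/online split described in the findings can be illustrated with a minimal sketch. It assumes each track has already been reduced to a fixed-length vector; the random vectors, the 128-dimension size, and the helper name `rank_covers` are all hypothetical stand-ins and do not come from the paper, whose network is not reproduced here. Only the comparison step (Euclidean distance between stored embeddings) is shown.

```python
import numpy as np


def rank_covers(query_embedding, database_embeddings, top_k=5):
    """Return indices and distances of the top_k closest database tracks.

    Hypothetical helper: assumes embeddings were extracted offline and
    stored as rows of a 2-D array with shape (n_tracks, dim).
    """
    # Euclidean distance from the query to every stored embedding.
    diffs = database_embeddings - query_embedding
    distances = np.linalg.norm(diffs, axis=1)
    # Smaller distance means a more likely cover candidate.
    order = np.argsort(distances)[:top_k]
    return order, distances[order]


# Stand-in data: random vectors replace real track embeddings (assumption).
rng = np.random.default_rng(0)
database = rng.normal(size=(1000, 128))

# Simulate a query that is a slightly perturbed copy of track 42.
query = database[42] + 0.01 * rng.normal(size=128)

indices, dists = rank_covers(query, database, top_k=3)
print(indices, dists)
```

Because the embeddings are computed once and stored, each query costs only one vectorised distance computation over the database, which is the property the findings attribute to the single-vector representation.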
Abstract
Automatic cover detection -- the task of finding in an audio database all the covers of one or several query tracks -- has long been seen as a challenging theoretical problem in the MIR community and as an acute practical problem for authors and composers societies. Original algorithms proposed for this task have proven their accuracy on small datasets, but are unable to scale up to modern real-life audio corpora. On the other hand, faster approaches designed to process thousands of pairwise comparisons resulted in lower accuracy, making them unsuitable for practical use. In this work, we propose a neural network architecture that is trained to represent each track as a single embedding vector. The computation burden is therefore left to the embedding extraction -- that can be conducted offline and stored, while the pairwise comparison task reduces to a simple Euclidean distance…
Taxonomy
Topics: Music and Audio Processing · Music Technology and Sound Studies · Speech Recognition and Synthesis
