Accurate and Scalable Version Identification Using Musically-Motivated Embeddings

Furkan Yesiler; Joan Serrà; Emilia Gómez

arXiv:1910.12551·cs.SD·April 14, 2020

TL;DR

This paper introduces MOVE, a musically-motivated embedding method for accurate and scalable version identification. It combines an alternative input representation, a triplet loss with hard triplet mining, and a data augmentation strategy designed for the task, and achieves state-of-the-art results on two publicly available benchmark sets.

Contribution

MOVE is the first approach to combine musically-motivated embeddings with scalable triplet loss training for version identification.

Findings

01 Achieves state-of-the-art performance on benchmark datasets

02 Demonstrates the effectiveness of temporal content summarization

03 Shows the impact of embedding dimensionality on performance

Abstract

The version identification (VI) task deals with the automatic detection of recordings that correspond to the same underlying musical piece. Despite many efforts, VI is still an open problem, with much room for improvement, especially with regard to combining accuracy and scalability. In this paper, we present MOVE, a musically-motivated method for accurate and scalable version identification. MOVE achieves state-of-the-art performance on two publicly available benchmark sets by learning scalable embeddings in a Euclidean distance space, using a triplet loss and a hard triplet mining strategy. It improves over previous work by employing an alternative input representation and by introducing a novel technique for temporal content summarization, a standardized latent space, and a data augmentation strategy specifically designed for VI. In addition to the main results, we perform an ablation…
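To make the training objective named in the abstract more concrete, here is a minimal sketch of a triplet loss over Euclidean distances with one common form of hard triplet mining ("batch-hard": for each anchor, take the farthest same-label item and the closest different-label item in the batch). This is an illustration under stated assumptions, not MOVE's implementation: the function names, the margin value, the use of non-squared distances, and the exact mining rule are all hypothetical choices that the abstract does not specify.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def batch_hard_triplet_loss(embeddings, labels, margin=1.0):
    """Mean triplet loss with batch-hard mining (illustrative assumption).

    embeddings: list of vectors, one per recording.
    labels: list of work/clique identifiers; versions of the same piece
            share a label.
    margin: hypothetical margin value, not taken from the paper.
    """
    losses = []
    for i, anchor in enumerate(embeddings):
        # Distances to other versions of the same piece (positives).
        pos = [euclidean(anchor, e) for j, e in enumerate(embeddings)
               if j != i and labels[j] == labels[i]]
        # Distances to recordings of different pieces (negatives).
        neg = [euclidean(anchor, e) for j, e in enumerate(embeddings)
               if labels[j] != labels[i]]
        if not pos or not neg:
            continue  # anchor cannot form a valid triplet in this batch
        # Hardest positive is the farthest; hardest negative is the closest.
        losses.append(max(0.0, max(pos) - min(neg) + margin))
    return sum(losses) / len(losses) if losses else 0.0


# Two pieces with two versions each; versions are well separated here,
# so every triplet already satisfies the margin.
separated = [[0.0, 0.0], [0.0, 1.0], [5.0, 0.0], [5.0, 1.0]]
# Versions of different pieces are interleaved, so the loss is positive.
interleaved = [[0.0, 0.0], [3.0, 0.0], [1.0, 0.0], [4.0, 0.0]]
labels = [0, 0, 1, 1]

loss_separated = batch_hard_triplet_loss(separated, labels)
loss_interleaved = batch_hard_triplet_loss(interleaved, labels)
```

Once embeddings are trained this way, retrieval reduces to nearest-neighbor search under Euclidean distance, which is what makes an embedding-based approach scale to large collections.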

Code & Models

Repositories

furkanyesiler/move
PyTorch · Official

Taxonomy

Methods: Triplet Loss