Content-based Music Similarity with Triplet Networks

Joseph Cleveland; Derek Cheng; Michael Zhou; Thorsten Joachims; Douglas Turnbull

arXiv:2008.04938·cs.LG·December 8, 2022·1 cites

TL;DR

This paper investigates whether triplet neural networks can embed songs by content similarity, compares two methods of selecting triplets, and reports initial success on a simple artist retrieval task.

Contribution

The paper applies a triplet network to music embedding and compares two ways of choosing the third song in each triplet: at random or based on shared genre labels.
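The two selection strategies can be illustrated with a minimal sketch. It is not the authors' code: the function name `sample_triplet`, the record fields `id`, `artist`, and `genre`, and the fallback to a random negative when no same-genre candidate exists are all assumptions made for illustration. It also assumes that "genre-based" means the third song shares a genre label with the anchor, which the summary above does not spell out.

```python
import random


def sample_triplet(songs, strategy="random", rng=random):
    """Return (anchor_id, positive_id, negative_id), or None if no triplet exists.

    songs: list of dicts with hypothetical keys 'id', 'artist', 'genre'.
    strategy: 'random' picks any song by a different artist as the negative;
              'genre' prefers different-artist songs sharing the anchor's genre
              (an assumed reading of genre-based selection).
    """
    anchor = rng.choice(songs)
    # Positive: another song by the same artist.
    positives = [s for s in songs
                 if s["artist"] == anchor["artist"] and s["id"] != anchor["id"]]
    # Negative candidates: any song by a different artist.
    negatives = [s for s in songs if s["artist"] != anchor["artist"]]
    if strategy == "genre":
        same_genre = [s for s in negatives if s["genre"] == anchor["genre"]]
        # Assumed fallback: use a random negative if no genre match exists.
        negatives = same_genre or negatives
    if not positives or not negatives:
        return None
    return anchor["id"], rng.choice(positives)["id"], rng.choice(negatives)["id"]
```

Calling `sample_triplet(catalogue, strategy="genre", rng=random.Random(0))` on a list of such records gives a reproducible draw. Same-genre negatives are plausibly harder to separate from the anchor than random ones, which is one reason to compare the two strategies.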

Findings

01 Shallow Siamese networks can embed music for artist retrieval.

02 Genre-based triplet selection improves embedding quality.

03 Initial results show feasibility for content-based music similarity.

Abstract

We explore the feasibility of using triplet neural networks to embed songs based on content-based music similarity. Our network is trained using triplets of songs such that two songs by the same artist are embedded closer to one another than to a third song by a different artist. We compare two models that are trained using different ways of picking this third song: at random vs. based on shared genre labels. Our experiments are conducted using songs from the Free Music Archive and use standard audio features. The initial results show that shallow Siamese networks can be used to embed music for a simple artist retrieval task.
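The training constraint described in the abstract (two songs by the same artist closer to each other than to a third song by a different artist) is commonly expressed as a hinge-style triplet loss. The sketch below shows that standard form. The abstract does not state the loss, distance, or margin the authors used, so the Euclidean distance, the `margin` value, and the function names here are assumptions.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard hinge triplet loss (assumed form, not taken from the paper).

    The loss is zero once the anchor is closer to the positive than to the
    negative by at least `margin`; otherwise it grows with the violation.
    """
    d_pos = euclidean(anchor, positive)   # same-artist distance
    d_neg = euclidean(anchor, negative)   # different-artist distance
    return max(0.0, d_pos - d_neg + margin)


# A satisfied triplet: positive at distance 1, negative at distance 5.
satisfied = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])
# A violated triplet: the same points with positive and negative swapped.
violated = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0])
```

In a real training loop the three embeddings would come from the same network applied to each song's audio features, and the loss would be averaged over a batch of triplets before backpropagation.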

Taxonomy

Topics: Music and Audio Processing · Music Technology and Sound Studies · Speech and Audio Processing