Identifying Cover Songs Using Information-Theoretic Measures of Similarity
Peter Foster, Simon Dixon, Anssi Klapuri

TL;DR
This paper explores information-theoretic measures for quantifying audio similarity to improve cover song detection, demonstrating that continuous-valued approaches outperform discrete ones and achieving state-of-the-art results on large datasets.
Contribution
It introduces novel continuous and discrete information-theoretic similarity measures, including an improved normalised compression distance with alignment, for cover song identification.
Findings
Continuous-valued approaches outperform discrete ones.
The proposed normalised compression distance with alignment (NCDA) improves over the standard normalised compression distance (NCD).
Achieved state-of-the-art performance on the Million Song Dataset.
Abstract
This paper investigates methods for quantifying similarity between audio signals, specifically for the task of cover song detection. We consider an information-theoretic approach, where we compute pairwise measures of predictability between time series. We compare discrete-valued approaches operating on quantised audio features with continuous-valued approaches. In the discrete case, we propose a method for computing the normalised compression distance, where we account for correlation between time series. In the continuous case, we propose to compute information-based measures of similarity as statistics of the prediction error between time series. We evaluate our methods on two cover song identification tasks, using a dataset comprising 300 Jazz standards and using the Million Song Dataset. For both datasets, we observe that continuous-valued approaches outperform discrete-valued…
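For readers unfamiliar with the discrete-valued baseline, the standard normalised compression distance can be sketched as follows. This is only an illustration of the textbook NCD, not the alignment-aware variant the paper proposes. The choice of zlib as the compressor, the function names, and the toy symbol sequences standing in for quantised audio features are all assumptions made for the example.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (compressor choice is an assumption)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Standard NCD: (C(xy) - min(C(x), C(y))) / max(C(x), C(y))."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)  # compress the plain concatenation of x and y
    return (cxy - min(cx, cy)) / max(cx, cy)


# Hypothetical quantised feature sequences, one byte per symbol.
song = bytes([0, 1, 2, 3, 4, 5, 6, 7] * 64)
cover = bytes([0, 1, 2, 3, 4, 5, 6, 7] * 64)   # shares all structure with `song`
rng = random.Random(0)
unrelated = bytes(rng.randrange(256) for _ in range(512))  # no shared structure

d_cover = ncd(song, cover)
d_unrelated = ncd(song, unrelated)
print(f"NCD(song, cover)     = {d_cover:.3f}")
print(f"NCD(song, unrelated) = {d_unrelated:.3f}")
```

A smaller value indicates that one sequence helps the compressor describe the other, so the pair sharing structure should score lower than the unrelated pair. Because the sequences are simply concatenated here, the measure does not model correlation between time-aligned symbols, which is the limitation the abstract says the proposed method addresses.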
