On the Robustness of Cover Version Identification Models: A Study Using Cover Versions from YouTube
Simon Hachmeier, Robert Jäschke

TL;DR
This study evaluates the robustness of cover song identification models on YouTube cover versions, revealing significant performance drops and identifying challenging alterations, thus highlighting limitations of current models in real-world scenarios.
Contribution
The paper introduces a new YouTube-based dataset for cover song identification and provides a taxonomy of alterations, assessing model performance on more diverse, real-world cover versions.
Findings
Models perform significantly worse on YouTube cover versions compared to community datasets.
Certain types of cover versions, like instrumental versions, are particularly difficult for models to identify.
A taxonomy of alterations in online cover versions is proposed.
Abstract
Recent advances in cover song identification have shown great success. However, models are usually tested on a fixed set of datasets that rely on the online cover song database SecondHandSongs. It is unclear how well models perform on cover songs on online video platforms, which might exhibit unexpected alterations. In this paper, we annotate a subset of songs from YouTube sampled by a multi-modal uncertainty sampling approach and evaluate state-of-the-art models. We find that existing models achieve significantly lower ranking performance on our dataset than on a community dataset. We additionally measure the performance on different types of versions (e.g., instrumental versions) and find several types that are particularly hard to rank. Lastly, we provide a taxonomy of alterations in cover versions on the web.
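The abstract does not name the ranking metric it uses. As an illustration only, the sketch below computes mean average precision (MAP), a metric commonly used for ranking tasks such as cover song identification. The function names, the query and candidate IDs, and the choice of MAP itself are assumptions for this example, not details taken from the paper.

```python
from typing import Dict, Sequence, Set


def average_precision(ranked_ids: Sequence[str], relevant_ids: Set[str]) -> float:
    """Average precision of one ranked candidate list for one query.

    ranked_ids: candidate IDs ordered from most to least similar (hypothetical).
    relevant_ids: IDs of candidates that are true versions of the query's work.
    """
    hits = 0
    precision_sum = 0.0
    for rank, item_id in enumerate(ranked_ids, start=1):
        if item_id in relevant_ids:
            hits += 1
            # Precision at this rank, counted only where a true version appears.
            precision_sum += hits / rank
    return precision_sum / len(relevant_ids) if relevant_ids else 0.0


def mean_average_precision(
    rankings: Dict[str, Sequence[str]],
    relevant: Dict[str, Set[str]],
) -> float:
    """Mean of the per-query average precision over all queries."""
    scores = [average_precision(rankings[q], relevant[q]) for q in rankings]
    return sum(scores) / len(scores) if scores else 0.0


# Toy data (made up): two queries, each with a ranked list of three candidates.
rankings = {"q1": ["a", "b", "c"], "q2": ["x", "y", "z"]}
relevant = {"q1": {"a", "c"}, "q2": {"y"}}
map_score = mean_average_precision(rankings, relevant)
```

Under this assumed metric, a model that ranks true versions lower on one dataset than on another gets a lower MAP there, which is the kind of gap the abstract describes between the YouTube data and a community dataset.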
Taxonomy
Topics: Digital Rights Management and Security · Video Analysis and Summarization
Methods: Sparse Evolutionary Training
