Evaluation of CNN-based Automatic Music Tagging Models

Minz Won; Andres Ferraro; Dmitry Bogdanov; Xavier Serra

arXiv:2006.00751 · eess.AS · June 2, 2020 · 47 cites

TL;DR

This paper provides a consistent evaluation of CNN-based music tagging models across multiple datasets, analyzing their robustness to input perturbations and offering reproducible implementations for future research.

Contribution

It offers a standardized comparison framework for CNN music tagging models and assesses their generalization under various input perturbations.

Findings

01. Models achieve comparable performance on the standard metrics (ROC-AUC and PR-AUC).

02. Input perturbations reduce tagging performance, indicating that the models are sensitive to them.

03. Reproducible pre-trained models are provided for future research.

Abstract

Recent advances in deep learning accelerated the development of content-based automatic music tagging systems. Music information retrieval (MIR) researchers proposed various architecture designs, mainly based on convolutional neural networks (CNNs), that achieve state-of-the-art results in this multi-label binary classification task. However, due to the differences in experimental setups followed by researchers, such as using different dataset splits and software versions for evaluation, it is difficult to compare the proposed architectures directly with each other. To facilitate further research, in this paper we conduct a consistent evaluation of different music tagging models on three datasets (MagnaTagATune, Million Song Dataset, and MTG-Jamendo) and provide reference results using common evaluation metrics (ROC-AUC and PR-AUC). Furthermore, all the models are evaluated with…
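As an illustration of the two metrics named in the abstract, the following is a minimal pure-Python sketch of ROC-AUC and PR-AUC for a multi-label tagging task. It rests on several assumptions that the text above does not confirm: PR-AUC is approximated by average precision, both metrics are macro-averaged over tags, and the toy labels and scores are invented for the example. The function names (`roc_auc`, `average_precision`, `macro_average`) are hypothetical and are not taken from the paper's code.

```python
from typing import Callable, List, Sequence


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """ROC-AUC for one tag via the rank-sum (Mann-Whitney U) formulation."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("ROC-AUC is undefined when only one class is present")
    order = sorted(range(len(y_score)), key=lambda i: y_score[i])
    ranks = [0.0] * len(y_score)
    i = 0
    while i < len(order):
        # Find the end of the current group of tied scores.
        j = i
        while j + 1 < len(order) and y_score[order[j + 1]] == y_score[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    pos_rank_sum = sum(r for r, t in zip(ranks, y_true) if t == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def average_precision(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Average precision for one tag, used here as a stand-in for PR-AUC.

    Simplification: tied scores are ordered by input position rather than
    being grouped into a single threshold.
    """
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("Average precision is undefined without positives")
    order = sorted(range(len(y_score)), key=lambda i: -y_score[i])
    hits = 0
    total = 0.0
    for rank, idx in enumerate(order, start=1):
        if y_true[idx] == 1:
            hits += 1
            total += hits / rank  # precision at this recall step
    return total / n_pos


def macro_average(
    metric: Callable[[Sequence[int], Sequence[float]], float],
    labels: List[List[int]],
    scores: List[List[float]],
) -> float:
    """Apply a per-tag metric to each column (tag) and average the results."""
    n_tags = len(labels[0])
    per_tag = [
        metric([row[t] for row in labels], [row[t] for row in scores])
        for t in range(n_tags)
    ]
    return sum(per_tag) / n_tags


# Toy data (invented): 4 clips, 2 tags; rows are clips, columns are tags.
labels = [[1, 0], [0, 1], [1, 1], [0, 0]]
scores = [[0.9, 0.2], [0.3, 0.8], [0.6, 0.4], [0.1, 0.5]]

macro_roc = macro_average(roc_auc, labels, scores)
macro_pr = macro_average(average_precision, labels, scores)
print(f"ROC-AUC: {macro_roc:.4f}  PR-AUC: {macro_pr:.4f}")
```

Each tag is scored as an independent binary problem and the per-tag values are then averaged, which is one common way to report a single number for a multi-label classifier. In practice a library implementation would normally replace the hand-written functions.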

Taxonomy

Topics: Music and Audio Processing · Video Analysis and Summarization · Speech Recognition and Synthesis