Automatic tagging using deep convolutional neural networks

Keunwoo Choi; George Fazekas; Mark Sandler

arXiv:1606.00298 · cs.SD · June 2, 2016 · 223 cites

TL;DR

This paper presents a fully convolutional neural network approach to automatic music tagging. A 4-layer architecture with mel-spectrogram input reaches state-of-the-art AUC-ROC on MagnaTagATune, and deeper models outperform it on the larger Million Song Dataset.

Contribution

It evaluates fully convolutional neural networks of varying complexity and with different input types for music tagging, highlighting the effectiveness of mel-spectrogram input and showing that deeper architectures benefit from more training data.

Findings

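01. Deeper models outperform the 4-layer architecture on the larger Million Song Dataset.

02. Mel-spectrograms are an effective time-frequency representation for music tagging.

03. A 4-layer architecture with mel-spectrogram input achieves state-of-the-art AUC-ROC on MagnaTagATune.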

Abstract

We present a content-based automatic music tagging algorithm using fully convolutional neural networks (FCNs). We evaluate different architectures consisting of 2D convolutional layers and subsampling layers only. In the experiments, we measure the AUC-ROC scores of architectures with different complexities and input types using the MagnaTagATune dataset, where a 4-layer architecture shows state-of-the-art performance with mel-spectrogram input. Furthermore, we evaluate the performance of architectures with varying numbers of layers on a larger dataset (Million Song Dataset) and find that deeper models outperform the 4-layer architecture. The experiments show that the mel-spectrogram is an effective time-frequency representation for automatic tagging and that more complex models benefit from more training data.
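The abstract reports results as AUC-ROC scores. As a rough illustration of that metric, the sketch below computes AUC-ROC for a single tag using the rank-sum (Mann-Whitney U) formulation and then averages it over tags. It is a minimal sketch, not the authors' evaluation code: the function names `roc_auc` and `mean_tag_auc`, the clips-by-tags matrix layout, and the choice to average per-tag scores are all assumptions made for illustration, and the numbers in the usage lines are toy values rather than results from the paper.

```python
def roc_auc(labels, scores):
    """AUC-ROC for one tag via the rank-sum (Mann-Whitney U) formulation.

    labels: iterable of 0/1 ground-truth values, one per clip.
    scores: iterable of predicted scores, one per clip.
    """
    pairs = sorted(zip(scores, labels))
    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = len(pairs) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC is undefined unless both classes are present")

    rank_sum_pos = 0.0
    i = 0
    while i < len(pairs):
        # Find the run of tied scores starting at position i.
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        # Tied scores share the average of the 1-based ranks i+1 .. j.
        avg_rank = (i + 1 + j) / 2.0
        positives_in_run = sum(1 for k in range(i, j) if pairs[k][1] == 1)
        rank_sum_pos += avg_rank * positives_in_run
        i = j

    u_statistic = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u_statistic / (n_pos * n_neg)


def mean_tag_auc(label_rows, score_rows):
    """Average the per-tag AUC-ROC over all tags (assumed aggregation).

    label_rows / score_rows: one row per clip, one column per tag.
    """
    n_tags = len(label_rows[0])
    per_tag = [
        roc_auc([row[t] for row in label_rows], [row[t] for row in score_rows])
        for t in range(n_tags)
    ]
    return sum(per_tag) / len(per_tag)


# Toy usage: four clips, two hypothetical tags.
toy_labels = [[1, 0], [0, 1], [1, 0], [0, 1]]
toy_scores = [[0.9, 0.2], [0.3, 0.1], [0.6, 0.7], [0.2, 0.8]]
toy_mean_auc = mean_tag_auc(toy_labels, toy_scores)
```

Because AUC-ROC depends only on how the scores rank positive clips against negative ones, it needs no decision threshold, which is convenient for tagging, where many tags apply to only a small fraction of clips.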

Taxonomy

Topics: Multimodal Machine Learning Applications · Topic Modeling · Video Analysis and Summarization