Transfer learning for music classification and regression tasks
Keunwoo Choi, György Fazekas, Mark Sandler, Kyunghyun Cho

TL;DR
This paper introduces a transfer learning method that uses features from a pre-trained convolutional network for music classification and regression, and shows that these features outperform a baseline MFCC feature on all the considered tasks.
Contribution
It presents a novel approach of using convnet features as general-purpose music representations for various classification and regression tasks.
Findings
Convnet features outperform MFCC in all tasks
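Convnet features outperform several previous approaches that aggregate MFCCs and low- and high-level music features
A convnet trained for music tagging transfers to other music-related classification and regression tasks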
Abstract
In this paper, we present a transfer learning approach for music classification and regression tasks. We propose to use a pre-trained convnet feature, a concatenated feature vector using the activations of feature maps of multiple layers in a trained convolutional network. We show how this convnet feature can serve as general-purpose music representation. In the experiments, a convnet is trained for music tagging and then transferred to other music-related classification and regression tasks. The convnet feature outperforms the baseline MFCC feature in all the considered tasks and several previous approaches that are aggregating MFCCs as well as low- and high-level music features.
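The concatenated convnet feature described in the abstract can be illustrated with a minimal sketch. The abstract does not say how each feature map is reduced before concatenation, so global average pooling is an assumption here, and the function name `convnet_feature`, the layer count, and the array shapes are hypothetical values chosen only for illustration.

```python
import numpy as np


def convnet_feature(layer_activations):
    """Build one feature vector from the feature maps of several layers.

    layer_activations: list of arrays, each shaped (channels, freq, time).
    Each feature map is reduced to a scalar by global average pooling
    (an assumption of this sketch), and the per-layer vectors are then
    concatenated into a single vector.
    """
    pooled = [act.mean(axis=(1, 2)) for act in layer_activations]
    return np.concatenate(pooled)


# Hypothetical activations: 3 layers, 4 feature maps each, with the
# spatial size halving at every layer.
rng = np.random.default_rng(0)
activations = [
    rng.standard_normal((4, 8 // (2 ** i), 16 // (2 ** i))) for i in range(3)
]

feature = convnet_feature(activations)
print(feature.shape)  # (12,)
```

The resulting vector could then be passed to any standard classifier or regressor for the target task, which is the transfer step the abstract refers to.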
Taxonomy
Topics: Music and Audio Processing · Speech and Audio Processing · Speech Recognition and Synthesis
