Convolutional Recurrent Neural Networks for Music Classification
Keunwoo Choi, George Fazekas, Mark Sandler, Kyunghyun Cho

TL;DR
This paper presents a hybrid convolutional recurrent neural network (CRNN) for music tagging. It uses CNNs for local feature extraction and RNNs for temporal summarisation, and shows strong performance relative to its number of parameters and training time.
Contribution
It introduces a CRNN for music tagging and compares it with three CNN structures previously used for the task, controlling the number of parameters and measuring performance and training time per sample.
Findings
CRNNs show strong performance relative to their number of parameters.
CRNNs show strong performance relative to their training time per sample.
The hybrid structure is effective for both music feature extraction and feature summarisation.
Abstract
We introduce a convolutional recurrent neural network (CRNN) for music tagging. CRNNs take advantage of convolutional neural networks (CNNs) for local feature extraction and recurrent neural networks for temporal summarisation of the extracted features. We compare CRNN with three CNN structures that have been used for music tagging while controlling the number of parameters with respect to their performance and training time per sample. Overall, we found that CRNNs show a strong performance with respect to the number of parameters and training time, indicating the effectiveness of their hybrid structure in music feature extraction and feature summarisation.
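The abstract's comparison "while controlling the number of parameters" can be illustrated with a small parameter-counting sketch. Everything concrete below is an assumption made for illustration and is not taken from this page: the input shape, the pooling sizes, the number of layers, the tag count, and the helper names (`crnn_params`, `cnn_params`, `match_width`) are all hypothetical. The sketch traces how pooling collapses the frequency axis so that the remaining time frames become RNN steps, then searches for the layer width that brings each model closest to a shared parameter budget.

```python
# Illustrative sketch only. Layer counts, shapes, and sizes are hypothetical
# and are not the configuration used in the paper.

N_TAGS = 50                                  # hypothetical number of output tags
INPUT_SHAPE = (96, 1440)                     # hypothetical (mel bins, time frames)
POOLS = [(2, 2), (3, 3), (4, 4), (4, 4)]     # hypothetical max-pooling sizes


def conv2d_params(in_ch, out_ch, kernel=(3, 3)):
    """Weights plus one bias per output channel for a 2D convolution."""
    kh, kw = kernel
    return (kh * kw * in_ch + 1) * out_ch


def gru_params(input_size, hidden_size):
    """Three gates, each with input weights, recurrent weights, and one bias.

    Some libraries keep two bias vectors per gate, which would add
    3 * hidden_size parameters to this count.
    """
    return 3 * (input_size * hidden_size + hidden_size * hidden_size + hidden_size)


def dense_params(in_features, out_features):
    """Weights plus one bias per output unit for a fully connected layer."""
    return (in_features + 1) * out_features


def pooled_shape(freq, time, pools):
    """Trace (freq, time) through successive pooling layers (floor division)."""
    for pool_f, pool_t in pools:
        freq, time = freq // pool_f, time // pool_t
    return freq, time


def crnn_params(width, hidden=32):
    """Four conv layers, then a two-layer GRU over the remaining time frames."""
    channels = [1] + [width] * 4
    conv = sum(conv2d_params(a, b) for a, b in zip(channels, channels[1:]))
    freq, _ = pooled_shape(*INPUT_SHAPE, POOLS)
    # Each time frame is flattened to width * freq features before the GRU.
    rnn = gru_params(width * freq, hidden) + gru_params(hidden, hidden)
    return conv + rnn + dense_params(hidden, N_TAGS)


def cnn_params(width):
    """Five conv layers pooled down to a single vector, then a dense layer."""
    channels = [1] + [width] * 5
    conv = sum(conv2d_params(a, b) for a, b in zip(channels, channels[1:]))
    return conv + dense_params(width, N_TAGS)


def match_width(budget, count_fn, max_width=512):
    """Return the layer width whose parameter count is closest to the budget."""
    return min(range(1, max_width + 1), key=lambda w: abs(count_fn(w) - budget))


if __name__ == "__main__":
    print("feature map after pooling:", pooled_shape(*INPUT_SHAPE, POOLS))
    for budget in (100_000, 1_000_000):
        for name, fn in (("CNN", cnn_params), ("CRNN", crnn_params)):
            w = match_width(budget, fn)
            print(f"budget={budget:>9,} {name:<4} width={w:<4} params={fn(w):,}")
```

Matching widths this way keeps model capacity roughly equal, so differences in accuracy or training time per sample can be attributed to the structure rather than to size.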
Taxonomy
Topics: Music and Audio Processing · Speech and Audio Processing · Music Technology and Sound Studies
