Music Genre Classification with ResNet and Bi-GRU Using Visual Spectrograms
Junfei Zhang

TL;DR
This paper introduces a hybrid deep learning model combining ResNet and Bi-GRU to classify music genres from visual spectrograms, aiming to improve accuracy over traditional methods by capturing both spatial and temporal features.
Contribution
The study presents a novel hybrid ResNet and Bi-GRU model that classifies music genres from spectrograms, addressing the reliance of traditional machine learning approaches on manually engineered features.
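To make the pairing of a residual feature extractor with a bidirectional GRU concrete, here is a toy sketch in plain Python. It is not the paper's architecture. The per-frame dense residual block stands in for ResNet's 2-D convolutions, the weights are random and untrained, biases are omitted, and every name and size (`init_params`, `classify`, 8 frequency bins, 10 genres) is a hypothetical chosen for illustration. It only shows how data might flow: spectrogram frames, then residual features, then forward and backward recurrent passes, then genre probabilities.

```python
import math
import random

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def rand_matrix(rows, cols, rng):
    """Small random weights; a trained model would learn these."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def residual_block(x, W):
    """y = x + relu(W x): the skip connection that defines a residual block."""
    return [xi + max(0.0, fi) for xi, fi in zip(x, matvec(W, x))]

def make_gru(in_dim, hid_dim, rng):
    """One (input, recurrent) weight pair per gate: update z, reset r, candidate n."""
    return {g: (rand_matrix(hid_dim, in_dim, rng),
                rand_matrix(hid_dim, hid_dim, rng)) for g in "zrn"}

def gru_step(p, x, h):
    """Single GRU update of hidden state h given input frame x."""
    z = [sigmoid(a + b) for a, b in zip(matvec(p["z"][0], x), matvec(p["z"][1], h))]
    r = [sigmoid(a + b) for a, b in zip(matvec(p["r"][0], x), matvec(p["r"][1], h))]
    n = [math.tanh(a + ri * b)
         for a, ri, b in zip(matvec(p["n"][0], x), r, matvec(p["n"][1], h))]
    return [(1 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]

def run_gru(p, frames, hid_dim):
    """Run the GRU over a sequence of frames and return the final hidden state."""
    h = [0.0] * hid_dim
    for x in frames:
        h = gru_step(p, x, h)
    return h

def init_params(n_bins, hid, n_genres, seed=0):
    """Hypothetical parameter container for the toy model."""
    rng = random.Random(seed)
    return {
        "hid": hid,
        "res": rand_matrix(n_bins, n_bins, rng),
        "fwd": make_gru(n_bins, hid, rng),
        "bwd": make_gru(n_bins, hid, rng),
        "out": rand_matrix(n_genres, 2 * hid, rng),
    }

def classify(spectrogram, params):
    """spectrogram: list of time frames, each a list of frequency-bin magnitudes."""
    # Per-frame features (toy stand-in for the ResNet stage).
    feats = [residual_block(f, params["res"]) for f in spectrogram]
    # Temporal context in both directions (the Bi-GRU stage).
    h_fwd = run_gru(params["fwd"], feats, params["hid"])
    h_bwd = run_gru(params["bwd"], reversed(feats), params["hid"])
    # Concatenate both final states and map them to genre probabilities.
    logits = matvec(params["out"], h_fwd + h_bwd)
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

if __name__ == "__main__":
    rng = random.Random(42)
    fake_spec = [[rng.random() for _ in range(8)] for _ in range(20)]  # 20 frames x 8 bins
    print(classify(fake_spec, init_params(n_bins=8, hid=4, n_genres=10)))
```

In practice one would more likely build this from a deep learning framework's convolutional and recurrent layers and train it on labeled spectrograms. The sketch only makes the spatial-then-temporal structure explicit.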
Findings
Enhanced genre classification accuracy
Effective capture of spatial and temporal features
Potential improvements for music recommender systems
Abstract
Music recommendation systems have emerged as a vital component for enhancing user experience and satisfaction on music streaming services, which now dominate music consumption. The key challenge in improving these recommender systems lies in comprehending the complexity of music data, particularly for the underlying task of music genre classification. The limitations of manual genre classification have highlighted the need for a more advanced approach, namely Automatic Music Genre Classification (AMGC). While traditional machine learning techniques have shown potential in genre classification, they rely heavily on manually engineered features and feature selection, and so fail to capture the full complexity of music data. On the other hand, deep learning classification architectures such as traditional Convolutional Neural Networks (CNNs) are effective in capturing the spatial hierarchies…
Taxonomy
Topics: Music and Audio Processing · Neuroscience and Music Perception · Music Technology and Sound Studies
