Music Genre Classification using Large Language Models

Mohamed El Amine Meguenani, Alceu de Souza Britto Jr., and Alessandro Lameiras Koerich

arXiv:2410.08321·cs.SD·October 14, 2024

TL;DR

This paper compares pre-trained large language models for audio (WavLM, HuBERT, and wav2vec 2.0) with 1D and 2D CNNs and the audio spectrogram transformer (AST) for music genre classification. AST achieves the highest overall accuracy, 85.5%.

Contribution

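The paper proposes a pipeline that splits audio into 20 ms chunks, extracts feature vectors with pre-trained LLMs, trains a classification head on those vectors, and aggregates chunk-level predictions at inference. It also compares these models with CNNs and AST, highlighting the superior performance of transformer-based architectures.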

Findings

01. AST model achieves 85.5% accuracy

02. Transformer-based models outperform CNNs and traditional methods

03. Zero-shot classification capability demonstrated

Abstract

This paper exploits the zero-shot capabilities of pre-trained large language models (LLMs) for music genre classification. The proposed approach splits audio signals into 20 ms chunks and processes them through convolutional feature encoders, a transformer encoder, and additional layers for coding audio units and generating feature vectors. The extracted feature vectors are used to train a classification head. During inference, predictions on individual chunks are aggregated for a final genre classification. We conducted a comprehensive comparison of LLMs, including WavLM, HuBERT, and wav2vec 2.0, with traditional deep learning architectures like 1D and 2D convolutional neural networks (CNNs) and the audio spectrogram transformer (AST). Our findings demonstrate the superior performance of the AST model, achieving an overall accuracy of 85.5%, surpassing all other models evaluated. These…
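The chunk-then-aggregate flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the 16 kHz sample rate, the dropping of a trailing partial chunk, and both aggregation rules (majority vote and mean probability) are assumptions, since the abstract does not specify them. All function names are hypothetical.

```python
from collections import Counter


def split_into_chunks(samples, sample_rate=16000, chunk_ms=20):
    """Split a 1-D sample sequence into non-overlapping chunks of chunk_ms milliseconds.

    Assumption: 16 kHz audio, and a trailing partial chunk is dropped.
    """
    chunk_len = sample_rate * chunk_ms // 1000  # 320 samples at 16 kHz
    n_chunks = len(samples) // chunk_len
    return [samples[i * chunk_len:(i + 1) * chunk_len] for i in range(n_chunks)]


def aggregate_majority(chunk_labels):
    """Assumed aggregation rule 1: pick the most frequent chunk-level label."""
    return Counter(chunk_labels).most_common(1)[0][0]


def aggregate_mean_probs(chunk_probs):
    """Assumed aggregation rule 2: average class probabilities, return the argmax index."""
    n_classes = len(chunk_probs[0])
    means = [
        sum(probs[c] for probs in chunk_probs) / len(chunk_probs)
        for c in range(n_classes)
    ]
    return max(range(n_classes), key=means.__getitem__)


# Usage: one second of silent audio yields 50 chunks of 320 samples each.
chunks = split_into_chunks([0.0] * 16000)
genre = aggregate_majority(["rock", "jazz", "rock"])
best_class = aggregate_mean_probs([[0.6, 0.4], [0.1, 0.9]])
```

In a full pipeline, each chunk would pass through the feature extractor and classification head before aggregation; those stages are omitted here because the abstract gives no details beyond their order.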

Code & Models

Repositories

megamine25/Music-genre-classification (PyTorch, official)

Taxonomy

Topics: Music and Audio Processing · Diverse Musicological Studies