Audio Processing using Pattern Recognition for Music Genre Classification

Sivangi Chatterjee; Srishti Ganguly; Avik Bose; Hrithik Raj Prasad; Arijit Ghosal

arXiv:2410.14990·cs.SD·October 22, 2024

Open Access

TL;DR

This paper applies machine learning techniques to classify five music genres from audio features in the GTZAN dataset, reaching 92.44% validation accuracy with an artificial neural network. The stated motivation is better music recommendation.

Contribution

It presents a neural network-based approach to genre classification built on key audio features, and reports that it outperforms Logistic Regression, KNN, and Random Forest.

Findings

01 ANN achieved 92.44% validation accuracy

02 Spectral features like MFCCs improved model performance

03 Neural networks outperformed Logistic Regression, KNN, and Random Forest

Abstract

This project explores the application of machine learning techniques for music genre classification using the GTZAN dataset, which contains 100 audio files per genre. Motivated by the growing demand for personalized music recommendations, we focused on classifying five genres (Blues, Classical, Jazz, Hip Hop, and Country) using a variety of algorithms including Logistic Regression, K-Nearest Neighbors (KNN), Random Forest, and Artificial Neural Networks (ANN) implemented via Keras. The ANN model demonstrated the best performance, achieving a validation accuracy of 92.44%. We also analyzed key audio features such as spectral roll-off, spectral centroid, and MFCCs, which helped enhance the model's accuracy. Future work will expand the model to cover all ten genres, investigate advanced methods like Long Short-Term Memory (LSTM) networks and ensemble approaches, and develop a web application…
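The abstract names spectral centroid and spectral roll-off among the features but this page does not say how they were computed. As a rough illustration only, the sketch below computes both for a single audio frame using NumPy. The function name `spectral_centroid_and_rolloff`, the Hann window, the 85% roll-off fraction, and the synthetic test tones are all assumptions made for this example, not details taken from the paper.

```python
import numpy as np


def spectral_centroid_and_rolloff(frame, sample_rate, rolloff_fraction=0.85):
    """Return (centroid_hz, rolloff_hz) for one frame of audio samples.

    Hypothetical helper for illustration; the 0.85 fraction is an assumed
    default, not a value reported by the paper.
    """
    # Taper the frame to reduce spectral leakage, then take the magnitude
    # spectrum over the non-negative frequencies.
    window = np.hanning(len(frame))
    magnitudes = np.abs(np.fft.rfft(frame * window))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)

    total = magnitudes.sum()
    if total == 0.0:
        # Silent frame: both features are undefined, so report zeros.
        return 0.0, 0.0

    # Centroid: magnitude-weighted mean frequency.
    centroid = float((freqs * magnitudes).sum() / total)

    # Roll-off: lowest frequency below which the chosen fraction of the
    # total spectral magnitude is contained.
    cumulative = np.cumsum(magnitudes)
    index = int(np.searchsorted(cumulative, rolloff_fraction * total))
    index = min(index, len(freqs) - 1)
    rolloff = float(freqs[index])
    return centroid, rolloff


# Two synthetic tones stand in for real audio (assumed values).
sample_rate = 22050
t = np.arange(2048) / sample_rate
low_tone = np.sin(2 * np.pi * 220 * t)
high_tone = np.sin(2 * np.pi * 4000 * t)

low_centroid, low_rolloff = spectral_centroid_and_rolloff(low_tone, sample_rate)
high_centroid, high_rolloff = spectral_centroid_and_rolloff(high_tone, sample_rate)
print(low_centroid, low_rolloff)
print(high_centroid, high_rolloff)
```

A brighter signal yields a higher centroid and roll-off, which is one plausible reason such features help separate genres. In practice these values would be computed per frame over a whole clip and summarized (for example by mean and variance) before being passed to a classifier.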

Taxonomy

Topics: Music and Audio Processing

Methods: Logistic Regression