Comparison of spectrogram scaling in multi-label Music Genre Recognition

Bartosz Karpiński; Cyryl Leszczyński

arXiv:2506.02091·cs.SD·June 4, 2025

TL;DR

This paper compares different spectrogram scaling methods and training approaches for multi-label music genre recognition, using a large, manually labeled dataset to evaluate their effectiveness in handling genre complexity.

Contribution

It introduces a comprehensive comparison of preprocessing and training methods tailored for multi-label music genre recognition with an extensive dataset.

Findings

1. Certain spectrogram scaling methods outperform others in genre classification accuracy.

2. Preprocessing choices significantly impact model performance in multi-label tasks.

3. The study provides insights into optimal approaches for genre recognition in complex musical datasets.
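
For readers unfamiliar with the multi-label setting, the following is a minimal sketch of how it differs from single-label classification: each record gets a multi-hot target vector, each genre gets an independent sigmoid output, and the loss is binary cross-entropy averaged over genres. The genre names, the 0.5 threshold, and all function names are hypothetical and chosen for illustration; nothing here is taken from the paper's implementation.

```python
import numpy as np

# Hypothetical label vocabulary; the paper's actual genre set is not given here.
GENRES = ["rock", "electronic", "jazz", "folk"]


def multi_hot(labels, vocabulary=GENRES):
    """Encode a list of genre names as a 0/1 vector, one slot per genre."""
    vec = np.zeros(len(vocabulary), dtype=np.float32)
    for label in labels:
        vec[vocabulary.index(label)] = 1.0
    return vec


def sigmoid(logits):
    """Independent per-genre probability (unlike softmax, rows need not sum to 1)."""
    return 1.0 / (1.0 + np.exp(-logits))


def binary_cross_entropy(logits, targets, eps=1e-7):
    """Mean binary cross-entropy over all genre slots."""
    probs = np.clip(sigmoid(logits), eps, 1.0 - eps)
    losses = targets * np.log(probs) + (1.0 - targets) * np.log(1.0 - probs)
    return float(-np.mean(losses))


def predict_labels(logits, threshold=0.5, vocabulary=GENRES):
    """Return every genre whose probability reaches the (assumed) threshold."""
    probs = sigmoid(logits)
    return [genre for genre, p in zip(vocabulary, probs) if p >= threshold]


# A record tagged with two genres at once.
target = multi_hot(["rock", "electronic"])
good_logits = np.array([3.0, 2.0, -4.0, -1.0])
bad_logits = -good_logits
```

Here `predict_labels(good_logits)` returns both tagged genres, and `binary_cross_entropy(good_logits, target)` is lower than the loss for `bad_logits`, which is the behaviour a multi-label model is trained toward.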

Abstract

As the accessibility and ease of use of digital audio workstations increase, so does the quantity of music available to the average listener. Additionally, differences between genres are not always well defined and can be abstract, with widely varying combinations of genres across individual records. In this article, multiple preprocessing methods and approaches to model training are described and compared, accounting for the eclectic nature of today's albums. A custom, manually labeled dataset of more than 18,000 entries was used to perform the experiments.
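
The abstract does not say which spectrogram scalings were compared, so the sketch below only illustrates what "spectrogram scaling" can mean in practice. It assumes three common variants: a linear magnitude spectrogram, a log-amplitude (decibel) version of it, and a mel-frequency projection. The frame size, hop length, number of mel bands, and all function names are illustrative assumptions, not the authors' settings.

```python
import numpy as np


def stft_magnitude(signal, n_fft=512, hop=256):
    """Linear magnitude spectrogram, shape (n_fft // 2 + 1, n_frames)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1)).T


def to_decibels(magnitude, floor_db=-80.0):
    """Log-amplitude scaling relative to the peak, clipped at floor_db."""
    ref = max(float(magnitude.max()), 1e-10)
    db = 20.0 * np.log10(np.maximum(magnitude, 1e-10) / ref)
    return np.maximum(db, floor_db)


def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(sr, n_fft, n_mels=40):
    """Triangular mel filters, shape (n_mels, n_fft // 2 + 1)."""
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    hz_points = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    bank = np.zeros((n_mels, len(fft_freqs)))
    for m in range(n_mels):
        left, center, right = hz_points[m : m + 3]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        bank[m] = np.maximum(0.0, np.minimum(rising, falling))
    return bank


# One second of a 440 Hz tone stands in for an audio excerpt.
sr = 16000
tone = np.sin(2.0 * np.pi * 440.0 * np.arange(sr) / sr)

linear_spec = stft_magnitude(tone)                   # linear frequency, linear amplitude
db_spec = to_decibels(linear_spec)                   # linear frequency, log amplitude
mel_spec = mel_filterbank(sr, 512) @ linear_spec     # mel frequency, linear amplitude
```

All three arrays describe the same audio; they differ only in how the frequency or amplitude axis is warped before being passed to a model, which is the kind of preprocessing choice the paper compares.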

Taxonomy

Topics: Music and Audio Processing