Comparison of spectrogram scaling in multi-label Music Genre Recognition
Bartosz Karpiński, Cyryl Leszczyński

TL;DR
This paper compares different spectrogram scaling methods and training approaches for multi-label music genre recognition, using a large, manually labeled dataset to evaluate their effectiveness in handling genre complexity.
Contribution
It introduces a comprehensive comparison of preprocessing and training methods tailored for multi-label music genre recognition with an extensive dataset.
Findings
Certain spectrogram scaling methods outperform others in genre classification accuracy
Preprocessing choices significantly impact model performance in multi-label tasks
The study provides insights into optimal approaches for genre recognition in complex musical datasets
Abstract
As the accessibility and ease of use of digital audio workstations increase, so does the quantity of music available to the average listener; additionally, differences between genres are not always well defined and can be abstract, with widely varying combinations of genres across individual records. In this article, multiple preprocessing methods and approaches to model training are described and compared, accounting for the eclectic nature of today's albums. A custom, manually labeled dataset of more than 18,000 entries has been used to perform the experiments.
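The abstract does not say which spectrogram scalings or label encodings the authors compare, so the following is only an illustrative sketch of what "spectrogram scaling" and "multi-label" typically mean in this setting. The function names (`stft_magnitude`, `scale_spectrogram`, `multi_hot`), the three scaling modes (linear, power, decibel), the STFT parameters, and the genre vocabulary are all assumptions made for the example and are not taken from the paper.

```python
import numpy as np

def stft_magnitude(signal, n_fft=512, hop=256):
    """Magnitude spectrogram from a Hann-windowed short-time Fourier transform."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    # rfft keeps the non-negative frequencies; transpose to (freq_bins, frames)
    return np.abs(np.fft.rfft(frames, axis=1)).T

def scale_spectrogram(mag, mode="linear", eps=1e-10):
    """Apply one of three common amplitude scalings (assumed, not from the paper)."""
    if mode == "linear":
        return mag
    if mode == "power":
        return mag ** 2
    if mode == "db":
        # Clamp before the logarithm so silent bins do not produce -inf
        return 20.0 * np.log10(np.maximum(mag, eps))
    raise ValueError(f"unknown scaling mode: {mode}")

def multi_hot(genres, vocabulary):
    """Encode a record's genre set as a 0/1 vector, one slot per known genre."""
    vec = np.zeros(len(vocabulary), dtype=np.float32)
    for genre in genres:
        vec[vocabulary.index(genre)] = 1.0
    return vec

# Hypothetical input: one second of a 440 Hz tone sampled at 8 kHz
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440.0 * t)

mag = stft_magnitude(tone)
db = scale_spectrogram(mag, mode="db")

# Hypothetical genre vocabulary; a record may carry several genres at once
vocabulary = ["jazz", "metal", "rock"]
target = multi_hot(["rock", "jazz"], vocabulary)
```

In a multi-label setup of this kind, a model would usually emit one independent sigmoid output per genre and be trained against the multi-hot target, rather than a single softmax over mutually exclusive classes.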
Taxonomy
Topics: Music and Audio Processing
