Random Projections of Mel-Spectrograms as Low-Level Features for Automatic Music Genre Classification
Juliano Henrique Foleiss, Tiago Fernandes Tavares

TL;DR
This paper demonstrates that random projections of Mel-spectrograms are effective low-level features for music genre classification, offering a computationally efficient alternative to learned features and transfer learning in shallow learning scenarios.
Contribution
It introduces random projections of Mel-spectrograms as a simple, low-cost feature extraction method for music genre classification, with results comparable to those of more complex learned features.
Findings
Random projections perform comparably to auto-encoder learned features.
They outperform transfer learning features in shallow learning scenarios.
They require less computational power for training than other projection-based features, and no extensive specialist knowledge.
Abstract
In this work, we analyse random projections of Mel-spectrograms as low-level features for music genre classification. This approach was compared to handcrafted features, features learned using an auto-encoder, and features obtained from a transfer learning setting. Tests on five well-known, publicly available datasets show that random projections lead to results comparable to learned features and outperform features obtained via transfer learning in a shallow learning scenario. Random projections do not require extensive specialist knowledge and, at the same time, require less computational power for training than other projection-based low-level features. Therefore, they are a viable choice for shallow learning content-based music genre classification.
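The core idea in the abstract can be illustrated with a minimal sketch: each Mel-spectrogram frame is multiplied by a fixed random matrix that is drawn once and never trained. The Gaussian matrix, the dimensions (128 Mel bands projected to 32 components), the function names, and the synthetic input below are all assumptions made for illustration; the summary above does not state the configuration the paper actually uses.

```python
import numpy as np


def random_projection_matrix(n_mels, n_components, seed=0):
    """Draw a fixed Gaussian projection matrix (hypothetical helper).

    The matrix is sampled once and never updated, so no training is needed.
    Scaling by 1/sqrt(n_components) is a common choice that roughly
    preserves pairwise distances in expectation.
    """
    rng = np.random.default_rng(seed)
    return rng.standard_normal((n_mels, n_components)) / np.sqrt(n_components)


def project_frames(mel_spec, proj):
    """Project each frame (row) of a Mel-spectrogram onto the random basis.

    mel_spec has shape (n_frames, n_mels); the result has shape
    (n_frames, n_components).
    """
    return mel_spec @ proj


# Synthetic stand-in for a real Mel-spectrogram (assumption: 200 frames,
# 128 Mel bands); real input would come from an audio front end.
mel_spec = np.random.default_rng(42).random((200, 128))

proj = random_projection_matrix(n_mels=128, n_components=32)
features = project_frames(mel_spec, proj)
print(features.shape)
```

In a shallow learning setting, the projected frames would typically be summarised per track and passed to a conventional classifier; that step is omitted here because its details are not given in this summary.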
Taxonomy
Topics: Music and Audio Processing · Speech and Audio Processing · Music Technology and Sound Studies
