TL;DR
This paper evaluates six pretrained audio models for music recommender systems (MRS), revealing significant variability in their effectiveness and highlighting the need for task-specific adaptation in Music Information Retrieval (MIR) applications.
Contribution
It provides a comparative analysis of pretrained audio representations in the context of music recommender systems, an area previously underexplored.
Findings
The six pretrained models vary considerably in effectiveness on MRS tasks.
Models that perform well on traditional MIR tasks may not transfer directly to recommendation.
The study establishes a baseline for future research in pretrained audio representations for MRS.
Abstract
Over the years, Music Information Retrieval (MIR) has proposed various models pretrained on large amounts of music data. Transfer learning showcases the proven effectiveness of pretrained backend models with a broad spectrum of downstream tasks, including auto-tagging and genre classification. However, MIR papers generally do not explore the efficiency of pretrained models for Music Recommender Systems (MRS). In addition, the Recommender Systems community tends to favour traditional end-to-end neural network learning over these models. Our research addresses this gap and evaluates the applicability of six pretrained backend models (MusicFM, Music2Vec, MERT, EncodecMAE, Jukebox, and MusiCNN) in the context of MRS. We assess their performance using three recommendation models: K-nearest neighbours (KNN), shallow neural network, and BERT4Rec. Our findings suggest that pretrained audio…
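To make the evaluation setup more concrete, here is a minimal sketch of how a nearest-neighbour recommender could consume frozen embeddings from a pretrained audio model. This is an illustration under stated assumptions, not the paper's implementation: the function names, the toy 2-dimensional vectors, the mean-pooled user profile, and the choice of cosine similarity are all hypothetical, and the abstract excerpt above does not specify how the authors configured their KNN model.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def recommend_knn(embeddings, listened, k=2):
    """Return the k unheard tracks closest to the user's mean listened embedding.

    embeddings: dict mapping track id -> embedding vector (assumed to come
                from a frozen pretrained audio model; values here are toy data)
    listened:   list of track ids the user has already played
    """
    dim = len(next(iter(embeddings.values())))
    # Assumption: the user profile is the plain average of listened-track embeddings.
    profile = [
        sum(embeddings[t][i] for t in listened) / len(listened)
        for i in range(dim)
    ]
    # Only tracks the user has not heard are candidates for recommendation.
    candidates = [t for t in embeddings if t not in listened]
    ranked = sorted(
        candidates,
        key=lambda t: cosine(embeddings[t], profile),
        reverse=True,
    )
    return ranked[:k]

# Hypothetical toy embeddings; a real model would emit hundreds of dimensions.
toy_embeddings = {
    "a": [1.0, 0.0],
    "b": [0.9, 0.1],
    "c": [0.0, 1.0],
    "d": [0.8, 0.2],
    "e": [0.1, 0.9],
}

print(recommend_knn(toy_embeddings, listened=["a"], k=2))
```

In a setup like this, swapping one pretrained model for another only changes the vectors in `embeddings`, so differences in recommendation quality can be attributed to the audio representation rather than to the recommender itself.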
Taxonomy
Methods: Dense Connections · Convolution · Dilated Convolution · VQ-VAE · Layer Normalization · Position-Wise Feed-Forward Layer · Residual Connection · Jukebox
