Semantic IDs for Music Recommendation

M. Jeffrey Mei; Florian Henkel; Samuel E. Sandberg; Oliver Bembom; Andreas F. Ehmann

arXiv:2507.18800 · cs.IR · July 28, 2025

TL;DR

This paper introduces semantic IDs, a shared content-based embedding approach, to enhance music recommendation accuracy and diversity while reducing model size, demonstrated through experiments and an online A/B test.

Contribution

It proposes semantic IDs as a novel shared embedding method that improves recommendation performance and reduces model complexity in music systems.

Findings

01 Semantic IDs improve recommendation accuracy

02 They increase diversity in recommendations

03 Model size is significantly reduced

Abstract

Training recommender systems for next-item recommendation often requires unique embeddings to be learned for each item, which may take up most of the trainable parameters for a model. Shared embeddings, such as using content information, can reduce the number of distinct embeddings to be stored in memory. This allows for a more lightweight model; correspondingly, model complexity can be increased due to having fewer embeddings to store in memory. We show the benefit of using shared content-based features ('semantic IDs') in improving recommendation accuracy and diversity, while reducing model size, for two music recommendation datasets, including an online A/B test on a music streaming service.
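The parameter saving described in the abstract can be illustrated with a minimal sketch. It assumes each item is described by a short tuple of discrete codes (for example, from quantizing a content embedding) and that an item's vector is the sum of one shared embedding per code. The abstract does not specify how the codes are derived or combined, so the class `SharedCodeEmbedding`, the helper functions, and all sizes below are hypothetical and chosen only for illustration.

```python
import random


def count_params_unique(num_items, dim):
    """Parameters needed when every item owns its own embedding row."""
    return num_items * dim


def count_params_shared(num_levels, codebook_size, dim):
    """Parameters needed when items share a few small code tables."""
    return num_levels * codebook_size * dim


class SharedCodeEmbedding:
    """Hypothetical shared embedding: one small table per code position."""

    def __init__(self, num_levels, codebook_size, dim, seed=0):
        rng = random.Random(seed)
        self.dim = dim
        # tables[level][code] is a dim-sized vector shared by every item
        # that carries `code` at position `level`.
        self.tables = [
            [[rng.gauss(0.0, 0.1) for _ in range(dim)]
             for _ in range(codebook_size)]
            for _ in range(num_levels)
        ]

    def embed(self, codes):
        """Sum the shared vectors selected by an item's code tuple."""
        if len(codes) != len(self.tables):
            raise ValueError("expected one code per level")
        out = [0.0] * self.dim
        for table, code in zip(self.tables, codes):
            row = table[code]
            for i in range(self.dim):
                out[i] += row[i]
        return out


# Illustrative sizes only (assumptions, not figures from the paper).
NUM_ITEMS = 1_000_000
DIM = 64
NUM_LEVELS = 3
CODEBOOK_SIZE = 256

unique_params = count_params_unique(NUM_ITEMS, DIM)
shared_params = count_params_shared(NUM_LEVELS, CODEBOOK_SIZE, DIM)

model = SharedCodeEmbedding(NUM_LEVELS, CODEBOOK_SIZE, DIM)
track_a = (12, 200, 7)   # two tracks with similar content share
track_b = (12, 200, 91)  # their first two codes in this toy example
vec_a = model.embed(track_a)
vec_b = model.embed(track_b)

print(f"unique-ID parameters:   {unique_params:,}")
print(f"shared-code parameters: {shared_params:,}")
```

With these assumed sizes, the unique-ID table needs 64,000,000 parameters, while the three shared tables need 49,152. The embedding cost now depends on the codebook sizes rather than the catalog size, which is what frees capacity for the rest of the model.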
