Music Recommendation with Large Language Models: Challenges, Opportunities, and Evaluation
Elena V. Epure, Yashar Deldjoo, Bruno Sguerra, Markus Schedl, Manuel Moussallam

TL;DR
This paper explores how large language models transform music recommendation systems, emphasizing new evaluation challenges and opportunities for more natural, flexible, and insightful user interactions.
Contribution
It provides a comprehensive review of LLM impacts on music recommendation, proposing new evaluation frameworks and highlighting opportunities and risks for future research.
Findings
LLMs enable natural-language interaction in music recommendation.
Traditional accuracy metrics are insufficient for LLM-based MRS.
Evaluation practices from NLP can inform MRS assessment.
Abstract
Music Recommender Systems (MRS) have long relied on an information-retrieval framing, where progress is measured mainly through accuracy on retrieval-oriented subtasks. While effective, this reductionist paradigm struggles to address the deeper question of what makes a good recommendation, and attempts to broaden evaluation, through user studies or fairness analyses, have had limited impact. The emergence of Large Language Models (LLMs) disrupts this framework: LLMs are generative rather than ranking-based, making standard accuracy metrics questionable. They also introduce challenges such as hallucinations, knowledge cutoffs, non-determinism, and opaque training data, rendering traditional train/test protocols difficult to interpret. At the same time, LLMs create new opportunities, enabling natural-language interaction and even allowing models to act as evaluators. This work argues…
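To make the abstract's concern about accuracy metrics concrete, here is a minimal sketch of scoring a generated list of track names against a catalog. It is an illustration, not the authors' evaluation protocol: the function name `evaluate_generated_list`, the exact-match normalization (lowercasing and trimming), and the definition of "hallucination" as "not found in the catalog" are all assumptions made for this example.

```python
from typing import Iterable, Sequence


def evaluate_generated_list(
    generated: Sequence[str],
    catalog: Iterable[str],
    relevant: Iterable[str],
    k: int = 10,
) -> dict:
    """Score free-text track names produced by a generative model.

    Assumption: a generated name counts as hallucinated when it does not
    match any catalog entry after lowercasing and trimming whitespace.
    """
    catalog_set = {t.strip().lower() for t in catalog}
    relevant_set = {t.strip().lower() for t in relevant}
    top_k = [t.strip().lower() for t in generated[:k]]

    # Items the model produced that actually exist in the catalog.
    in_catalog = [t for t in top_k if t in catalog_set]
    # Of those, the items the user is known to have liked.
    hits = [t for t in in_catalog if t in relevant_set]

    return {
        "hallucination_rate": (1 - len(in_catalog) / len(top_k)) if top_k else 0.0,
        "precision_at_k": len(hits) / len(top_k) if top_k else 0.0,
        "recall_at_k": len(set(hits)) / len(relevant_set) if relevant_set else 0.0,
    }
```

A ranking model can only return catalog items, so its hallucination rate is zero by construction. A generative model has no such guarantee, which is one reason precision and recall alone give an incomplete picture of its output.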
Taxonomy
Topics: Recommender Systems and Techniques · Music and Audio Processing · Topic Modeling
