Music Recommendation with Large Language Models: Challenges, Opportunities, and Evaluation
By: Elena V. Epure, Yashar Deldjoo, Bruno Sguerra and more
Potential Business Impact:
Helps music apps pick songs you'll love.
Music Recommender Systems (MRS) have long relied on an information-retrieval framing, where progress is measured mainly through accuracy on retrieval-oriented subtasks. While effective, this reductionist paradigm struggles to address the deeper question of what makes a good recommendation, and attempts to broaden evaluation, through user studies or fairness analyses, have had limited impact. The emergence of Large Language Models (LLMs) disrupts this framework: LLMs are generative rather than ranking-based, making standard accuracy metrics questionable. They also introduce challenges such as hallucinations, knowledge cutoffs, non-determinism, and opaque training data, rendering traditional train/test protocols difficult to interpret. At the same time, LLMs create new opportunities, enabling natural-language interaction and even allowing models to act as evaluators. This work argues that the shift toward LLM-driven MRS requires rethinking evaluation. We first review how LLMs reshape user modeling, item modeling, and natural-language recommendation in music. We then examine evaluation practices from NLP, highlighting methodologies and open challenges relevant to MRS. Finally, we synthesize insights, focusing on how LLM prompting applies to MRS, to outline a structured set of success and risk dimensions. Our goal is to provide the MRS community with an updated, pedagogical, and cross-disciplinary perspective on evaluation.
Similar Papers
A Survey on Large Language Models in Multimodal Recommender Systems
Information Retrieval
Helps computers suggest movies and products better.
Enhance Large Language Models as Recommendation Systems with Collaborative Filtering
Information Retrieval
Makes computer suggestions better by learning from users.
LLM-Based Intelligent Agents for Music Recommendation: A Comparison with Classical Content-Based Filtering
Information Retrieval
Finds music you'll love, faster than before.