Exploration in Interactive Personalized Music Recommendation: A Reinforcement Learning Approach

Xinxi Wang; Yi Wang; David Hsu; Ye Wang

arXiv:1311.6355 · cs.MM · November 26, 2013 · 25 cites

Open Access

TL;DR

This paper introduces a reinforcement learning-based music recommendation system that balances exploring new songs and exploiting known preferences, using a Bayesian model for improved long-term user engagement.

Contribution

The paper formulates music recommendation as a multi-armed bandit problem with a Bayesian model of user preferences, unifying recommendation and playlist generation.

Findings

01. Simulation results show improved recommendation quality.

02. A user study indicates positive user engagement.

03. The model effectively balances exploration and exploitation.

Abstract

Current music recommender systems typically act in a greedy fashion by recommending songs with the highest user ratings. Greedy recommendation, however, is suboptimal over the long term: it does not actively gather information on user preferences and fails to recommend novel songs that are potentially interesting. A successful recommender system must balance the needs to explore user preferences and to exploit this information for recommendation. This paper presents a new approach to music recommendation by formulating this exploration-exploitation trade-off as a reinforcement learning task called the multi-armed bandit. To learn user preferences, it uses a Bayesian model, which accounts for both audio content and the novelty of recommendations. A piecewise-linear approximation to the model and a variational inference algorithm are employed to speed up Bayesian inference. One additional…
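
The exploration-exploitation trade-off described in the abstract can be illustrated with a minimal sketch. The code below is not the paper's model: it ignores audio content and novelty and uses plain Beta-Bernoulli Thompson sampling as a stand-in Bayesian bandit, compared against a greedy picker. All names (`thompson_pick`, `greedy_pick`, `simulate`, `like_probs`) and the "liked / not liked" feedback are assumptions made for illustration only.

```python
import random


def thompson_pick(successes, failures, rng):
    """Sample each song's like-probability from its Beta posterior
    (uniform Beta(1, 1) prior) and pick the song with the largest sample."""
    samples = [rng.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)]
    return max(range(len(samples)), key=samples.__getitem__)


def greedy_pick(successes, failures, rng):
    """Pick the song with the highest posterior mean; never explores on purpose.
    Ties resolve to the lowest index. `rng` is unused, kept for a uniform signature."""
    means = [(s + 1) / (s + f + 2) for s, f in zip(successes, failures)]
    return max(range(len(means)), key=means.__getitem__)


def simulate(pick, like_probs, rounds, seed):
    """Recommend one song per round to a simulated listener.

    like_probs[i] is the (hidden) probability that the listener likes song i.
    Returns the total number of likes plus the per-song like/dislike counts.
    """
    rng = random.Random(seed)
    n = len(like_probs)
    successes, failures = [0] * n, [0] * n
    total = 0
    for _ in range(rounds):
        arm = pick(successes, failures, rng)
        liked = rng.random() < like_probs[arm]
        if liked:
            successes[arm] += 1
        else:
            failures[arm] += 1
        total += int(liked)
    return total, successes, failures


if __name__ == "__main__":
    hidden = [0.2, 0.5, 0.8]  # hypothetical listener preferences
    for name, policy in [("greedy", greedy_pick), ("thompson", thompson_pick)]:
        likes, s, f = simulate(policy, hidden, rounds=500, seed=0)
        plays = [a + b for a, b in zip(s, f)]
        print(name, "likes:", likes, "plays per song:", plays)
```

Running the script prints how many times each policy played each song. The greedy picker can lock onto whichever song looked good early, while the sampling picker keeps some probability of trying under-explored songs, which is the behaviour the abstract argues for. The outcome depends on the seed and on the assumed `like_probs`.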

Taxonomy

Topics: Advanced Bandit Algorithms Research · Smart Grid Energy Management · Recommender Systems and Techniques

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings