Exploration in Interactive Personalized Music Recommendation: A Reinforcement Learning Approach
Xinxi Wang, Yi Wang, David Hsu, Ye Wang

TL;DR
This paper introduces a reinforcement learning-based music recommendation system that balances exploring new songs and exploiting known preferences, using a Bayesian model for improved long-term user engagement.
Contribution
It formulates music recommendation as a multi-armed bandit problem with a Bayesian model, enabling unified recommendation and playlist generation.
Findings
Simulation results show improved recommendation quality.
A user study indicates positive user engagement.
The model effectively balances exploration and exploitation.
Abstract
Current music recommender systems typically act in a greedy fashion by recommending songs with the highest user ratings. Greedy recommendation, however, is suboptimal over the long term: it does not actively gather information on user preferences and fails to recommend novel songs that are potentially interesting. A successful recommender system must balance the needs to explore user preferences and to exploit this information for recommendation. This paper presents a new approach to music recommendation by formulating this exploration-exploitation trade-off as a reinforcement learning task called the multi-armed bandit. To learn user preferences, it uses a Bayesian model, which accounts for both audio content and the novelty of recommendations. A piecewise-linear approximation to the model and a variational inference algorithm are employed to speed up Bayesian inference. One additional…
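The exploration-exploitation trade-off described above can be illustrated with a small bandit simulation. The sketch below is not the paper's model: it swaps in plain Thompson sampling with an independent Beta-Bernoulli posterior per song, and it ignores audio content and novelty entirely. The names `BetaBernoulliBandit` and `simulate`, and the like probabilities, are hypothetical and chosen only for illustration.

```python
import random


class BetaBernoulliBandit:
    """Thompson sampling over songs, one Beta posterior per song.

    Illustrative assumption: feedback is binary (liked / skipped) and
    songs are independent. The paper's Bayesian model is richer.
    """

    def __init__(self, n_songs, seed=None):
        self.alpha = [1.0] * n_songs  # 1 + number of likes (uniform prior)
        self.beta = [1.0] * n_songs   # 1 + number of skips
        self.rng = random.Random(seed)

    def recommend(self):
        # Draw one plausible like-probability per song from its posterior,
        # then play the song whose draw is highest. Uncertain songs get
        # wide draws, so they are still tried occasionally (exploration).
        samples = [self.rng.betavariate(a, b)
                   for a, b in zip(self.alpha, self.beta)]
        return max(range(len(samples)), key=samples.__getitem__)

    def update(self, song, liked):
        # Conjugate Beta-Bernoulli posterior update.
        if liked:
            self.alpha[song] += 1.0
        else:
            self.beta[song] += 1.0


def simulate(true_like_prob, rounds=2000, seed=0):
    """Run the bandit against a simulated user; return play counts per song."""
    user_rng = random.Random(seed)
    bandit = BetaBernoulliBandit(len(true_like_prob), seed=seed + 1)
    plays = [0] * len(true_like_prob)
    for _ in range(rounds):
        song = bandit.recommend()
        liked = user_rng.random() < true_like_prob[song]
        bandit.update(song, liked)
        plays[song] += 1
    return plays


if __name__ == "__main__":
    # Hypothetical user who likes song 2 most often.
    print(simulate([0.2, 0.5, 0.8]))
```

A greedy recommender would lock onto whichever song received the first good rating. Here every song keeps a nonzero chance of being sampled while its posterior is still wide, and play counts concentrate on the best song as evidence accumulates.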
Taxonomy
Topics: Advanced Bandit Algorithms Research · Smart Grid Energy Management · Recommender Systems and Techniques
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
