TL;DR
This paper frames personalized carousel recommendation in music streaming apps as a contextual bandit problem that accounts for large item catalogs, differing user preferences, and delayed feedback, and evaluates it on a large-scale playlist recommendation task in a real music streaming mobile app.
Contribution
It models carousel personalization as a contextual multi-armed bandit problem with multiple plays, cascade-based updates, and delayed batch feedback, and it publicly releases industrial data and an open-source simulation environment for further research.
Findings
Effective at modeling real-world carousel characteristics
Improved playlist recommendation performance
Public release of experimental data and simulation environment
Abstract
Media service providers, such as music streaming platforms, frequently leverage swipeable carousels to recommend personalized content to their users. However, selecting the most relevant items (albums, artists, playlists, ...) to display in these carousels is a challenging task, as items are numerous and users have different preferences. In this paper, we model carousel personalization as a contextual multi-armed bandit problem with multiple plays, cascade-based updates and delayed batch feedback. We empirically show the effectiveness of our framework at capturing characteristics of real-world carousels by addressing a large-scale playlist recommendation task on a global music streaming mobile app. Along with this paper, we publicly release industrial data from our experiments, as well as an open-source environment to simulate comparable carousel personalization learning problems.
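To make the three ingredients named in the abstract more concrete (multiple plays, cascade-based updates, delayed batch feedback), here is a minimal Python sketch. It is not the authors' implementation or their released environment. It assumes Thompson sampling with Beta priors as the bandit policy, reduces "context" to a hypothetical user-segment index, and assumes a cascade rule in which a user who clicks nothing is treated as having seen only the first `n_visible` slots, while a user who clicks is treated as having seen every slot up to the last click. All function and parameter names are illustrative.

```python
import random


def select_carousel(alpha, beta, n_slots, rng):
    """Multiple plays: sample one score per item, return the top n_slots items."""
    scores = [rng.betavariate(a, b) for a, b in zip(alpha, beta)]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:n_slots]


def cascade_observations(carousel, clicked, n_visible):
    """Return (item, reward) pairs for the slots assumed to have been seen.

    Assumption: without any click, only the first n_visible slots count as
    seen; with clicks, every slot up to the last clicked one counts as seen.
    Slots beyond that point produce no feedback at all.
    """
    clicked_positions = [pos for pos, item in enumerate(carousel) if item in clicked]
    n_seen = n_visible
    if clicked_positions:
        n_seen = max(n_visible, clicked_positions[-1] + 1)
    return [(item, 1 if item in clicked else 0) for item in carousel[:n_seen]]


def run_batch(alpha, beta, click_prob, users, n_slots, n_visible, rng):
    """Serve one batch of users, then apply all feedback at once.

    `users` is a list of segment indices (the hypothetical context).
    `click_prob[segment][item]` is a simulated ground-truth click rate.
    """
    pending = []
    for segment in users:
        carousel = select_carousel(alpha[segment], beta[segment], n_slots, rng)
        # Toy user model: each displayed item is clicked independently.
        clicked = {i for i in carousel if rng.random() < click_prob[segment][i]}
        pending.append((segment, cascade_observations(carousel, clicked, n_visible)))
    # Delayed batch feedback: counters change only after the batch is served.
    for segment, observations in pending:
        for item, reward in observations:
            alpha[segment][item] += reward
            beta[segment][item] += 1 - reward
```

Calling `run_batch` repeatedly with per-segment counters initialized to 1.0 simulates successive update rounds. Because the counters are only touched at the end of each call, every user within a batch is served by the same policy, which is the point of the delayed-feedback setting.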
