TL;DR
This paper introduces a new efficient contextual bandit algorithm for personalized news recommendation, demonstrating significant click rate improvements using large-scale offline evaluation on real-world data.
Contribution
The paper presents a novel, computationally efficient contextual bandit algorithm and an offline evaluation method for personalized news recommendation systems.
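This summary does not spell out how the offline evaluation works. As a rough illustration only, the sketch below assumes a replay-style estimator: logged events in which the displayed article was chosen uniformly at random are replayed, and only events where the candidate policy would have shown the same article are counted. The function name, the event layout, and the toy policy are all hypothetical, and the policy here is static (it does not learn during the replay).

```python
from typing import Callable, Iterable, Optional, Sequence, Tuple

# Hypothetical event layout: (context, candidate_articles, shown_article, click)
Event = Tuple[str, Sequence[str], str, int]


def replay_evaluate(
    policy: Callable[[str, Sequence[str]], str],
    logged_events: Iterable[Event],
) -> Tuple[Optional[float], int]:
    """Estimate a policy's click-through rate from logged events.

    Assumes the logged article was picked uniformly at random, so the
    events that agree with the policy form an unbiased sample for it.
    Returns (estimated CTR or None if nothing matched, matched count).
    """
    matched = 0
    clicks = 0
    for context, candidates, shown, click in logged_events:
        # Keep the event only if the policy agrees with what was shown.
        if policy(context, candidates) == shown:
            matched += 1
            clicks += click
    ctr = clicks / matched if matched else None
    return ctr, matched


def toy_policy(context: str, candidates: Sequence[str]) -> str:
    """Illustrative rule: sports fans get sports, everyone else gets news."""
    return "sports" if context == "fan" else "news"


# Small hand-made log, purely synthetic.
toy_log = [
    ("fan", ["sports", "news"], "sports", 1),
    ("fan", ["sports", "news"], "news", 0),
    ("other", ["sports", "news"], "news", 1),
    ("other", ["sports", "news"], "sports", 0),
    ("fan", ["sports", "news"], "sports", 0),
]
estimated_ctr, used_events = replay_evaluate(toy_policy, toy_log)
```

Events that disagree with the policy are discarded, so only a fraction of the log contributes to the estimate. That is one reason an approach like this would need a large log to give stable numbers.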
Findings
Achieved a 12.5% click lift over a standard context-free bandit algorithm.
Validated the approach on a dataset with over 33 million events.
Demonstrated greater advantages when data is limited.
Abstract
Personalized web services strive to adapt their services (advertisements, news articles, etc.) to individual users by making use of both content and user information. Despite a few recent advances, this problem remains challenging for at least two reasons. First, web services feature dynamically changing pools of content, rendering traditional collaborative filtering methods inapplicable. Second, the scale of most web services of practical interest calls for solutions that are both fast in learning and computation. In this work, we model personalized recommendation of news articles as a contextual bandit problem, a principled approach in which a learning algorithm sequentially selects articles to serve users based on contextual information about the users and articles, while simultaneously adapting its article-selection strategy based on user-click feedback to maximize total…
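The select-then-adapt loop described in the abstract can be sketched in a few lines. The abstract text shown here does not name the scoring rule, so the sketch assumes one common choice: a separate ridge-regression click model per article plus an upper-confidence bonus. The class and function names, the `alpha` parameter, and the synthetic click model are all hypothetical and exist only for illustration.

```python
import numpy as np


class LinearUCBArm:
    """Ridge-regression click model for a single article (hypothetical helper)."""

    def __init__(self, dim, alpha=1.0):
        self.alpha = alpha          # width of the confidence bonus (assumed knob)
        self.A = np.eye(dim)        # identity plus sum of outer products of contexts
        self.b = np.zeros(dim)      # sum of reward-weighted contexts

    def score(self, x):
        """Predicted click rate plus an exploration bonus for context x."""
        A_inv = np.linalg.inv(self.A)
        theta = A_inv @ self.b
        return float(theta @ x + self.alpha * np.sqrt(x @ A_inv @ x))

    def update(self, x, reward):
        """Fold one observed (context, click) pair into the model."""
        self.A += np.outer(x, x)
        self.b += reward * x


def select_article(arms, x):
    """Return the id of the article with the highest optimistic score."""
    return max(arms, key=lambda article_id: arms[article_id].score(x))


def simulate(steps=500, dim=4, seed=0):
    """Run the loop against a synthetic click model and return the click rate."""
    rng = np.random.default_rng(seed)
    # Synthetic ground truth, not data from the paper.
    true_theta = {"a": rng.normal(size=dim), "b": rng.normal(size=dim)}
    arms = {name: LinearUCBArm(dim, alpha=0.5) for name in true_theta}
    clicks = 0.0
    for _ in range(steps):
        x = rng.normal(size=dim)
        x /= np.linalg.norm(x)                      # unit-length context vector
        chosen = select_article(arms, x)            # select an article
        p_click = 1.0 / (1.0 + np.exp(-(true_theta[chosen] @ x)))
        reward = float(rng.random() < p_click)      # simulated user click
        arms[chosen].update(x, reward)              # adapt from feedback
        clicks += reward
    return clicks / steps
```

Because each article keeps its own small model in a dictionary, articles can be added to or removed from `arms` at any time, which fits the dynamically changing content pool the abstract mentions.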
