Scalable Neural Contextual Bandit for Recommender Systems
Zheqing Zhu, Benjamin Van Roy

TL;DR
This paper introduces ENR, a scalable neural contextual bandit algorithm that improves recommendation quality and efficiency, significantly boosting user engagement while reducing computational costs in large-scale systems.
Contribution
The paper presents Epistemic Neural Recommendation (ENR), a novel epistemic neural network architecture that enables efficient Thompson sampling in large-scale recommender systems, addressing the heavy computational demands of existing neural contextual bandit methods.
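To make the idea of Thompson sampling with an epistemic network concrete, here is a minimal sketch. It is not the ENR architecture from the paper: `ToyEpistemicModel`, its linear form, and every parameter name are hypothetical, chosen only for illustration. It assumes a model whose prediction depends on both the action features and a random "epistemic index" `z`, so that drawing one `z` per decision and acting greedily with respect to it approximates sampling from a posterior over reward functions.

```python
import numpy as np


class ToyEpistemicModel:
    """Hypothetical stand-in for an epistemic network (NOT the paper's ENR).

    Prediction: f(x, z) = x @ w + (x @ V) @ z
    where w is a point-estimate term and V scales the uncertainty term
    driven by the epistemic index z.
    """

    def __init__(self, context_dim, index_dim, seed=0):
        rng = np.random.default_rng(seed)
        self.w = rng.normal(size=context_dim)
        self.V = 0.1 * rng.normal(size=(context_dim, index_dim))
        self.index_dim = index_dim

    def predict(self, features, z):
        # features: (num_actions, context_dim); z: (index_dim,)
        return features @ self.w + (features @ self.V) @ z


def thompson_select(model, action_features, rng):
    """Draw one epistemic index, then act greedily under that draw."""
    z = rng.standard_normal(model.index_dim)
    scores = model.predict(action_features, z)
    return int(np.argmax(scores)), scores


# Usage: pick one of 5 candidate items described by 8 features each.
rng = np.random.default_rng(1)
model = ToyEpistemicModel(context_dim=8, index_dim=4)
actions = rng.normal(size=(5, 8))
choice, scores = thompson_select(model, actions, rng)
```

In this sketch each decision costs a single forward pass under one sampled index, rather than maintaining and sampling from a full posterior over network weights. Training the model from observed rewards is omitted.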
Findings
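ENR increases click-through rates by at least 9% compared to state-of-the-art neural contextual bandit algorithms.
ENR improves user ratings by at least 6% compared to the same baselines.
ENR matches the best-performing baseline algorithm with at least 29% fewer user interactions.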
Abstract
High-quality recommender systems ought to deliver both innovative and relevant content through effective and exploratory interactions with users. Yet, supervised learning-based neural networks, which form the backbone of many existing recommender systems, only leverage recognized user interests, falling short when it comes to efficiently uncovering unknown user preferences. While there has been some progress with neural contextual bandit algorithms towards enabling online exploration through neural networks, their onerous computational demands hinder widespread adoption in real-world recommender systems. In this work, we propose a scalable sample-efficient neural contextual bandit algorithm for recommender systems. To do this, we design an epistemic neural network architecture, Epistemic Neural Recommendation (ENR), that enables Thompson sampling at a large scale. In two distinct…
Taxonomy
Topics: Advanced Bandit Algorithms Research · Recommender Systems and Techniques · Data Stream Mining Techniques
