Evaluating Online Bandit Exploration In Large-Scale Recommender System
Hongbo Guo, Ruben Naeff, Alex Nikulkov, Zheqing Zhu

TL;DR
This paper presents a new testing framework for evaluating bandit algorithms in large-scale recommender systems, addressing fairness and data leakage issues, and demonstrates its effectiveness through extensive experiments.
Contribution
The paper introduces a novel test framework with new metrics for fair evaluation of bandit algorithms in production recommender systems.
Findings
The proposed framework enables fair evaluation of bandit algorithms.
Applying UCB in a large-scale system improves exploration efficiency.
Extensive experiments validate the effectiveness of the new evaluation method.
Abstract
Bandit learning has become an increasingly popular design choice for recommender systems. Despite strong interest in bandit learning from the community, there remain multiple bottlenecks that prevent many bandit learning approaches from reaching production. One major bottleneck is how to test the effectiveness of a bandit algorithm fairly and without data leakage. Unlike supervised learning algorithms, bandit learning algorithms place great emphasis on the data collection process through their explorative nature. Such explorative behavior may induce unfair evaluation in a classic A/B test setting. In this work, we apply upper confidence bound (UCB) to our large-scale short video recommender system and present a test framework for the production bandit learning life-cycle with a new set of metrics. Extensive experiment results show that our experiment design is able to fairly…
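For readers unfamiliar with UCB, the following is a minimal sketch of the textbook UCB1 rule on simulated Bernoulli arms. It is not the paper's production implementation, which the abstract does not specify. The function names (`ucb1_select`, `run_ucb1`), the exploration constant `c`, and the simulated click rates are all assumptions made for illustration.

```python
import math
import random


def ucb1_select(counts, sums, t, c=2.0):
    """Return the index of the arm with the highest UCB1 score.

    counts[a] is the number of pulls of arm a, sums[a] its total reward,
    t the current round (1-based), and c a hypothetical exploration constant.
    """
    # Pull every arm once first; the bonus is undefined for zero pulls.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    # Score = empirical mean reward + confidence bonus that shrinks with pulls.
    scores = [
        sums[a] / counts[a] + math.sqrt(c * math.log(t) / counts[a])
        for a in range(len(counts))
    ]
    return max(range(len(scores)), key=scores.__getitem__)


def run_ucb1(true_rates, horizon, seed=0):
    """Simulate Bernoulli rewards (illustrative only) and return pull counts."""
    rng = random.Random(seed)
    counts = [0] * len(true_rates)
    sums = [0.0] * len(true_rates)
    for t in range(1, horizon + 1):
        arm = ucb1_select(counts, sums, t)
        reward = 1.0 if rng.random() < true_rates[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    return counts


if __name__ == "__main__":
    # Three simulated items with made-up click rates.
    print(run_ucb1([0.1, 0.5, 0.9], horizon=2000))
```

The confidence bonus is what makes the data the policy collects depend on the policy itself: rarely shown items receive a large bonus and get extra impressions. This is the explorative behavior the abstract says can make a classic A/B comparison unfair.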
Taxonomy
Topics: Advanced Bandit Algorithms Research · Data Stream Mining Techniques · Smart Grid Energy Management
Methods: Test
