Uncertainty of Joint Neural Contextual Bandit
Hongbo Guo, Zheqing Zhu

TL;DR
This paper investigates uncertainty estimation in joint neural contextual bandits, providing theoretical insights and experimental validation to improve hyper-parameter tuning in large-scale recommendation systems.
Contribution
It offers a theoretical analysis of the uncertainty parameter in joint neural bandits and validates findings with real industrial data, aiding practical deployment.
Findings
Uncertainty $\sigma$ scales with $\sqrt{\frac{F}{N}}$
The parameter $\alpha$ relates to model size and data volume
Experimental results confirm theoretical predictions
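The $\sqrt{F/N}$ scaling in the findings above can be illustrated with a small numerical sketch. This is not the paper's experiment. It assumes that $F$ is the dimension of the feature vector fed to the uncertainty estimate and $N$ is the number of training samples (the summary does not define either symbol), and it swaps in a standard LinUCB-style uncertainty $\sigma = \sqrt{x^\top A^{-1} x}$ with isotropic Gaussian features. The function name `mean_uncertainty` and all parameter values are hypothetical.

```python
import numpy as np

def mean_uncertainty(F, N, n_queries=200, ridge=1.0, seed=0):
    """Average LinUCB-style sigma = sqrt(x^T A^{-1} x) over random query points.

    Assumption: F = feature dimension, N = number of training samples.
    """
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((N, F))           # N training feature vectors of dimension F
    A = ridge * np.eye(F) + X.T @ X           # ridge-regularized design matrix
    A_inv = np.linalg.inv(A)
    Q = rng.standard_normal((n_queries, F))   # fresh query feature vectors
    # Quadratic form q^T A^{-1} q for every query row q.
    sigma = np.sqrt(np.einsum("ij,jk,ik->i", Q, A_inv, Q))
    return float(sigma.mean())

# Compare the measured average sigma with sqrt(F / N) for a few settings.
for F, N in [(8, 1000), (8, 4000), (32, 4000)]:
    print(F, N, round(mean_uncertainty(F, N), 4), round(float(np.sqrt(F / N)), 4))
```

Under these assumptions the measured value should track $\sqrt{F/N}$ closely: quadrupling $N$ roughly halves $\sigma$, and quadrupling $F$ roughly doubles it.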
Abstract
Contextual bandit learning is increasingly favored in modern large-scale recommendation systems. To better utilize contextual information and available user or item features, neural networks have been integrated into contextual bandit learning, which has triggered significant interest from both academia and industry. However, a major challenge arises when implementing a disjoint neural contextual bandit solution in large-scale recommendation systems, where each item or user may correspond to a separate bandit arm. The huge number of items to recommend poses a significant hurdle for real-world production deployment. This paper focuses on a joint neural contextual bandit solution that serves all items to be recommended in a single model. The output consists of a predicted reward, an uncertainty, and a hyper-parameter which balances…
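The abstract is truncated before it states how the three outputs are combined. As a rough sketch only, the snippet below assumes the common upper-confidence-bound form, score = predicted reward + $\alpha$ × uncertainty. That form is an assumption, not something this summary confirms. The function name `ucb_scores` and the numbers are hypothetical.

```python
import numpy as np

def ucb_scores(mu, sigma, alpha):
    """Combine predicted reward and uncertainty into one ranking score.

    Assumed form (not confirmed by the summary): score = mu + alpha * sigma.
    """
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    return mu + alpha * sigma

mu = [0.30, 0.25, 0.10]      # hypothetical predicted rewards for three items
sigma = [0.01, 0.10, 0.40]   # hypothetical uncertainties for the same items

for alpha in (0.0, 1.0):
    best = int(np.argmax(ucb_scores(mu, sigma, alpha)))
    print(f"alpha={alpha}: recommend item {best}")
```

With `alpha = 0` the ranking depends only on the predicted reward. A larger `alpha` shifts the choice toward items whose reward is less certain, which is why the value of this hyper-parameter matters for tuning.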
Taxonomy
Topics: Decision-Making and Behavioral Economics
Methods: ALIGN
