Uncertainty of Joint Neural Contextual Bandit

Hongbo Guo; Zheqing Zhu

arXiv:2406.02515·cs.LG·June 5, 2024

TL;DR

This paper investigates uncertainty estimation in joint neural contextual bandits, providing theoretical insights and experimental validation to improve hyper-parameter tuning in large-scale recommendation systems.

Contribution

It offers a theoretical analysis of the uncertainty parameter in joint neural bandits and validates findings with real industrial data, aiding practical deployment.

Findings

01

Uncertainty $\sigma$ scales with $\sqrt{\frac{F}{N}}$

02

The parameter $\alpha$ relates to model size and data volume

03

Experimental results confirm theoretical predictions
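
The $\sqrt{F/N}$ scaling in the first finding can be checked numerically with a small sketch. This page does not define $F$ and $N$, so the code assumes $F$ is the dimension of the last-layer features and $N$ is the number of training samples. It also assumes a ridge-regularized linear last layer with $\sigma(x) = \sqrt{x^\top A^{-1} x}$ and $A = \Phi^\top \Phi + \lambda I$, a common linear-bandit form that may differ from the authors' exact formulation. The function name `rms_uncertainty`, the Gaussian features, and all parameter values are hypothetical.

```python
import numpy as np


def rms_uncertainty(feature_dim, num_samples, num_queries=200, ridge=1.0, seed=0):
    """Root-mean-square of sigma(x) = sqrt(x^T A^{-1} x) over random query points.

    Assumption: A = Phi^T Phi + ridge * I, where Phi holds i.i.d. standard
    normal "last-layer features" (a stand-in for a trained network's features).
    """
    rng = np.random.default_rng(seed)
    # One row of synthetic features per training sample: shape (N, F).
    features = rng.standard_normal((num_samples, feature_dim))
    # Regularized feature covariance, shape (F, F).
    cov = features.T @ features + ridge * np.eye(feature_dim)
    # Query points drawn from the same distribution: shape (Q, F).
    queries = rng.standard_normal((num_queries, feature_dim))
    # Solve A z = x for every query instead of forming A^{-1} explicitly.
    solved = np.linalg.solve(cov, queries.T)  # shape (F, Q)
    # sigma^2 for each query is the dot product x . z.
    sigma_sq = np.einsum("qf,fq->q", queries, solved)
    return float(np.sqrt(sigma_sq.mean()))


if __name__ == "__main__":
    for feature_dim, num_samples in [(16, 2000), (64, 2000), (16, 8000)]:
        observed = rms_uncertainty(feature_dim, num_samples)
        predicted = np.sqrt(feature_dim / num_samples)
        print(f"F={feature_dim:3d} N={num_samples:5d} "
              f"sigma={observed:.4f} sqrt(F/N)={predicted:.4f}")
```

Under these assumptions, the observed value should track $\sqrt{F/N}$ closely when $N \gg F$: quadrupling $N$ roughly halves $\sigma$, and quadrupling $F$ roughly doubles it.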

Abstract

Contextual bandit learning is increasingly favored in modern large-scale recommendation systems. To better utilize the contextual information and the available user or item features, neural networks have been integrated into contextual bandit learning, which has triggered significant interest from both academia and industry. However, a major challenge arises when implementing a disjoint neural contextual bandit solution in large-scale recommendation systems, where each item or user may correspond to a separate bandit arm. The huge number of items to recommend poses a significant hurdle for real-world production deployment. This paper focuses on a joint neural contextual bandit solution that serves all recommended items in a single model. The output consists of a predicted reward $\mu$, an uncertainty $\sigma$, and a hyper-parameter $\alpha$ which balances…

Taxonomy

Topics: Decision-Making and Behavioral Economics

Methods: ALIGN