Neural Bandit with Arm Group Graph
Yunzhe Qi, Yikun Ban, Jingrui He

TL;DR
This paper introduces a novel neural bandit model leveraging an Arm Group Graph and graph neural networks to better capture group correlations, achieving near-optimal regret bounds and superior experimental performance.
Contribution
The paper proposes AGG-UCB, a neural bandit algorithm utilizing an Arm Group Graph and GNNs, with theoretical regret bounds and extensive empirical validation.
Findings
AGG-UCB is proven to achieve a near-optimal regret bound.
The GNN-based model captures correlations among arm groups.
In experiments, AGG-UCB outperforms the baselines.
Abstract
Contextual bandits aim to identify, among a set of arms, the optimal one with the highest reward based on the arms' contextual information. Motivated by the fact that arms usually exhibit group behaviors and that groups influence one another, we introduce a new model, Arm Group Graph (AGG), where the nodes represent the groups of arms and the weighted edges represent the correlations among groups. To leverage the rich information in AGG, we propose a bandit algorithm, AGG-UCB, in which neural networks estimate rewards and graph neural networks (GNN) learn the representations of arm groups with correlations. To solve the exploitation-exploration dilemma in bandits, we derive a new upper confidence bound (UCB) built on neural networks (exploitation) for exploration. Furthermore, we prove that AGG-UCB can achieve a near-optimal regret bound with…
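Illustrative sketch
The abstract's idea of scoring arms with information shared across correlated groups can be illustrated with a small toy example. This is not the authors' AGG-UCB: it swaps the GNN for a single step of normalized-adjacency propagation and swaps the neural UCB for a linear (LinUCB-style) confidence bonus. The group weights, contexts, and function names are all hypothetical and chosen only for illustration.

```python
import numpy as np


def normalize_adjacency(weights):
    """Symmetrically normalize a weighted adjacency matrix after adding self-loops."""
    a = weights + np.eye(weights.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def ucb_scores(features, theta, cov_inv, alpha):
    """Estimated reward plus a linear exploration bonus for each arm."""
    mean = features @ theta
    # Quadratic form x_i^T C x_i for every row x_i of `features`.
    width = np.sqrt(np.einsum("ij,jk,ik->i", features, cov_inv, features))
    return mean + alpha * width


# Hypothetical toy problem: 3 arm groups, one arm per group, 2-d contexts.
group_weights = np.array([
    [0.0, 0.8, 0.1],
    [0.8, 0.0, 0.3],
    [0.1, 0.3, 0.0],
])
contexts = np.array([
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 1.0],
])

# One propagation step mixes each group's context with its neighbors' contexts,
# weighted by the (assumed) group correlations.
s = normalize_adjacency(group_weights)
features = s @ contexts

# Untrained linear model: zero weights and identity inverse covariance.
theta = np.zeros(2)
cov_inv = np.eye(2)

scores = ucb_scores(features, theta, cov_inv, alpha=1.0)
chosen = int(np.argmax(scores))
print("UCB scores:", scores, "chosen arm:", chosen)
```

With an untrained model, the scores consist only of the exploration bonus, so the choice reflects uncertainty rather than estimated reward. In a full bandit loop, `theta` and `cov_inv` would be updated after each observed reward.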
