Neural Bandit with Arm Group Graph

Yunzhe Qi; Yikun Ban; Jingrui He

arXiv:2206.03644 · cs.LG · June 13, 2022

TL;DR

This paper introduces a novel neural bandit model leveraging an Arm Group Graph and graph neural networks to better capture group correlations, achieving near-optimal regret bounds and superior experimental performance.

Contribution

The paper proposes AGG-UCB, a neural bandit algorithm utilizing an Arm Group Graph and GNNs, with theoretical regret bounds and extensive empirical validation.

Findings

01. AGG-UCB achieves a near-optimal regret bound.

02. The Arm Group Graph lets the model capture correlations among arm groups.

03. AGG-UCB outperforms the baselines in the experiments.

Abstract

Contextual bandits aim to identify, among a set of arms, the optimal one with the highest reward based on the arms' contextual information. Motivated by the fact that arms usually exhibit group behaviors and that groups influence one another, we introduce a new model, Arm Group Graph (AGG), where the nodes represent groups of arms and the weighted edges encode the correlations among groups. To leverage the rich information in AGG, we propose a bandit algorithm, AGG-UCB, in which neural networks estimate rewards and graph neural networks (GNN) learn representations of the correlated arm groups. To address the exploitation-exploration dilemma in bandits, we derive a new upper confidence bound (UCB) for exploration, built on the neural networks used for exploitation. Furthermore, we prove that AGG-UCB can achieve a near-optimal regret bound with…
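
The selection loop described in the abstract can be illustrated with a heavily simplified sketch. It is not the authors' AGG-UCB: a linear reward estimator stands in for the neural networks, a single step of normalized-adjacency mixing stands in for the GNN, and the exploration bonus is the standard linear UCB term rather than the bound derived in the paper. All names (`normalize_adjacency`, `arm_features`, `GraphUCBSketch`, `alpha`, `lam`) are hypothetical and chosen for illustration only.

```python
import numpy as np

def normalize_adjacency(adj):
    """Add self-loops and symmetrically normalize a weighted group graph."""
    a = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def arm_features(context, group, adj_norm):
    """Place the arm's context in its group's row, then mix rows over the graph.

    Assumption: one propagation step as a stand-in for a GNN layer.
    """
    n_groups = adj_norm.shape[0]
    x = np.zeros((n_groups, context.shape[0]))
    x[group] = context
    return (adj_norm @ x).ravel()

class GraphUCBSketch:
    """Linear UCB over graph-mixed features (stand-in for a neural estimator)."""

    def __init__(self, dim, alpha=1.0, lam=1.0):
        self.alpha = alpha            # exploration weight (hypothetical parameter)
        self.A = lam * np.eye(dim)    # regularized feature covariance
        self.b = np.zeros(dim)        # reward-weighted feature sum

    def select(self, feats):
        """Return the index of the arm with the largest estimate plus bonus."""
        A_inv = np.linalg.inv(self.A)
        theta = A_inv @ self.b
        means = feats @ theta                                   # exploitation
        widths = np.sqrt(np.einsum("ij,jk,ik->i", feats, A_inv, feats))
        return int(np.argmax(means + self.alpha * widths))      # plus exploration

    def update(self, feat, reward):
        """Fold one observed (features, reward) pair into the estimator."""
        self.A += np.outer(feat, feat)
        self.b += reward * feat

# Two groups joined by an edge of weight 0.5, one arm per group.
adj_norm = normalize_adjacency(np.array([[0.0, 0.5], [0.5, 0.0]]))
feats = np.stack([
    arm_features(np.array([1.0, 0.0]), 0, adj_norm),
    arm_features(np.array([0.0, 1.0]), 1, adj_norm),
])
bandit = GraphUCBSketch(dim=feats.shape[1])
for _ in range(20):
    bandit.update(feats[0], reward=1.0)   # arm 0 keeps paying off
chosen = bandit.select(feats)
```

Because the features of an arm are mixed across neighboring groups, a reward observed for one group also shifts the estimate for arms in correlated groups, which is the intuition behind using the graph at all.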
