Collaborative Pure Exploration in Kernel Bandit

Yihan Du; Wei Chen; Yuko Kuroki; Longbo Huang

arXiv:2110.15771 · cs.LG · March 17, 2023

TL;DR

This paper formulates Collaborative Pure Exploration in Kernel Bandit (CoPE-KB), a multi-agent, multi-task model of decision making under limited communication and general reward functions. It gives one algorithm for each of the Fixed-Confidence and Fixed-Budget settings, with theoretical guarantees and empirical validation.

Contribution

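The paper formulates the CoPE-KB model, designs two algorithms built on kernelized estimators (CoopKernelFC for the Fixed-Confidence setting and CoopKernelFB for the Fixed-Budget setting), and establishes matching upper and lower bounds under both statistical and communication metrics to show that the algorithms are optimal.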

Findings

01 Algorithms achieve computation and communication efficiency.

02 Theoretical bounds quantify task similarity effects.

03 Empirical results validate theoretical claims and show superior performance.

Abstract

In this paper, we formulate a Collaborative Pure Exploration in Kernel Bandit problem (CoPE-KB), which provides a novel model for multi-agent multi-task decision making under limited communication and general reward functions, and is applicable to many online learning tasks, e.g., recommendation systems and network scheduling. We consider two settings of CoPE-KB, i.e., Fixed-Confidence (FC) and Fixed-Budget (FB), and design two optimal algorithms CoopKernelFC (for FC) and CoopKernelFB (for FB). Our algorithms are equipped with innovative and efficient kernelized estimators to simultaneously achieve computation and communication efficiency. Matching upper and lower bounds under both the statistical and communication metrics are established to demonstrate the optimality of our algorithms. The theoretical bounds successfully quantify the influences of task similarities on learning…
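As a rough illustration of what a kernelized reward estimator does in a kernel bandit, the sketch below fits a standard single-agent kernel ridge regression to noisy arm pulls and picks the arm with the highest estimated mean. It is not the estimator used by CoopKernelFC or CoopKernelFB, which the abstract describes as communication-efficient and multi-agent. The RBF kernel, the lengthscale, the regularization value, and every function name and data point here are assumptions chosen for illustration.

```python
import math

def rbf_kernel(x, y, lengthscale):
    """Gaussian (RBF) kernel between two feature vectors (assumed kernel choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * lengthscale ** 2))

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - tail) / M[i][i]
    return z

def kernel_ridge_estimates(sampled, rewards, arms, reg=0.1, lengthscale=0.2):
    """Estimate each arm's mean reward from noisy pulls via kernel ridge regression."""
    n = len(sampled)
    # Regularized Gram matrix K + reg * I over the pulled arms.
    gram = [[rbf_kernel(sampled[i], sampled[j], lengthscale) + (reg if i == j else 0.0)
             for j in range(n)] for i in range(n)]
    alpha = solve(gram, rewards)
    return [sum(rbf_kernel(x, s, lengthscale) * a for s, a in zip(sampled, alpha))
            for x in arms]

# Hypothetical data: three arms with 1-D features, five noisy pulls.
arms = [[0.0], [0.5], [1.0]]
sampled = [[0.0], [0.0], [0.5], [1.0], [1.0]]
rewards = [0.1, 0.2, 0.9, 0.3, 0.4]

estimates = kernel_ridge_estimates(sampled, rewards, arms)
best_arm = max(range(len(arms)), key=lambda i: estimates[i])
print(best_arm, [round(e, 3) for e in estimates])
```

In a pure exploration setting, the learner's goal is to output such a best arm rather than to maximize reward while sampling. The Fixed-Confidence and Fixed-Budget settings differ in whether the confidence level or the number of samples is fixed in advance.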

Videos

Collaborative Pure Exploration in Kernel Bandit · SlidesLive

Taxonomy

Topics: Advanced Bandit Algorithms Research · Reinforcement Learning in Robotics · Stochastic Gradient Optimization Techniques