Multi-Agent Low-Dimensional Linear Bandits

Ronshee Chawla; Abishek Sankararaman; Sanjay Shakkottai

arXiv:2007.01442 · cs.LG · May 26, 2022

Open Access

TL;DR

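This paper introduces a decentralized multi-agent linear bandit algorithm that exploits side information (a finite collection of low-dimensional subspaces, one of which contains the unknown parameter). Agents exchange subspace indices over a communication graph, and the per-agent regret is much smaller than when agents do not communicate.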

Contribution

It proposes a novel decentralized algorithm in which agents communicate subspace indices to one another and each agent plays a projected variant of LinUCB on the corresponding low-dimensional subspace, yielding smaller per-agent regret than in the setting without communication.

Findings

01

Per-agent finite-time regret is much smaller with communication than when agents do not communicate.

02

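The algorithm distributes the search for the optimal subspace across agents, while each agent learns the unknown vector within its corresponding low-dimensional subspace.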

03

Simulations confirm improved performance over non-communicative approaches.

Abstract

We study a multi-agent stochastic linear bandit with side information, parameterized by an unknown vector $\theta^{*} \in \mathbb{R}^{d}$. The side information consists of a finite collection of low-dimensional subspaces, one of which contains $\theta^{*}$. In our setting, agents can collaborate to reduce regret by sending recommendations across a communication graph connecting them. We present a novel decentralized algorithm, where agents communicate subspace indices with each other and each agent plays a projected variant of LinUCB on the corresponding (low-dimensional) subspace. By distributing the search for the optimal subspace across users and learning of the unknown vector by each agent in the corresponding low-dimensional subspace, we show that the per-agent finite-time regret is much smaller than the case when agents do not communicate. We finally complement these results through…
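The "projected variant of LinUCB" mentioned in the abstract can be illustrated with a minimal sketch. It assumes the agent already holds one candidate subspace, given as a matrix with orthonormal columns, and runs ordinary ridge-regression LinUCB in the coordinates of that subspace. The class name `ProjectedLinUCB`, the parameters `reg` and `beta`, and the fixed confidence width are illustrative assumptions, not the paper's exact construction; the exchange of subspace indices between agents is not shown.

```python
import numpy as np


class ProjectedLinUCB:
    """LinUCB run in the coordinates of a fixed m-dimensional subspace of R^d.

    Hypothetical helper for illustration; not the paper's exact algorithm.
    """

    def __init__(self, basis, reg=1.0, beta=1.0):
        # basis: (d, m) matrix whose orthonormal columns span the subspace.
        self.basis = np.asarray(basis, dtype=float)
        m = self.basis.shape[1]
        self.beta = beta          # fixed exploration width (an assumption)
        self.V = reg * np.eye(m)  # regularized Gram matrix in subspace coordinates
        self.b = np.zeros(m)      # sum of reward-weighted projected actions

    def select(self, actions):
        # actions: (K, d) array of candidate action vectors.
        Z = np.asarray(actions, dtype=float) @ self.basis  # (K, m) projections
        V_inv = np.linalg.inv(self.V)
        theta_hat = V_inv @ self.b                          # ridge estimate
        widths = np.sqrt(np.einsum("ki,ij,kj->k", Z, V_inv, Z))
        # Optimistic index: estimated reward plus confidence width.
        return int(np.argmax(Z @ theta_hat + self.beta * widths))

    def update(self, action, reward):
        # Project the played action, then update the sufficient statistics.
        z = self.basis.T @ np.asarray(action, dtype=float)
        self.V += np.outer(z, z)
        self.b += reward * z

    def estimate(self):
        # Ridge estimate mapped back to R^d; it lies in the subspace by construction.
        return self.basis @ np.linalg.solve(self.V, self.b)
```

Because the Gram matrix is only m × m, the estimation problem has dimension m rather than d, which is the intuition behind learning faster once the right subspace is identified.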

Taxonomy

Topics: Advanced Bandit Algorithms Research · Age of Information Optimization · Game Theory and Applications