Multi-Agent Thompson Sampling for Bandit Applications with Sparse Neighbourhood Structures

Timothy Verstraeten; Eugenio Bargiacchi; Pieter JK Libin; Jan Helsen; Diederik M Roijers; Ann Nowé

arXiv:1911.10120·cs.LG·June 25, 2020

TL;DR

This paper introduces MATS, a Bayesian algorithm for multi-agent bandit problems with sparse interactions, improving coordination efficiency and outperforming existing methods in synthetic and real-world wind farm scenarios.

Contribution

We propose MATS, a novel Bayesian exploration-exploitation algorithm tailored for loosely-coupled multi-agent bandit problems, with theoretical regret bounds and superior empirical performance.

Findings

01. MATS achieves sublinear regret bounds in sparse multi-agent settings.

02. MATS outperforms the state-of-the-art algorithm MAUCE on benchmarks.

03. Application to wind farm control demonstrates practical benefits of MATS.
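
For readers unfamiliar with the term, "sublinear regret" in the first finding can be read against the standard bandit definition below. The symbols (the mean reward \mu, the optimal joint action a^{*}, the joint action a_t chosen at round t, and the horizon T) are generic notation assumed here for illustration and may differ from the paper's own; the paper's specific bound is not reproduced.

```latex
R(T) = \sum_{t=1}^{T} \left( \mu(a^{*}) - \mathbb{E}\left[ \mu(a_t) \right] \right),
\qquad
\text{sublinear regret: } \lim_{T \to \infty} \frac{R(T)}{T} = 0 .
```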

Abstract

Multi-agent coordination is prevalent in many real-world applications. However, such coordination is challenging due to its combinatorial nature. An important observation in this regard is that agents in the real world often only directly affect a limited set of neighbouring agents. Leveraging such loose couplings among agents is key to making coordination in multi-agent systems feasible. In this work, we focus on learning to coordinate. Specifically, we consider the multi-agent multi-armed bandit framework, in which fully cooperative loosely-coupled agents must learn to coordinate their decisions to optimize a common objective. We propose multi-agent Thompson sampling (MATS), a new Bayesian exploration-exploitation algorithm that leverages loose couplings. We provide a regret bound that is sublinear in time and low-order polynomial in the highest number of actions of a single agent for…
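
The idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the common reward decomposes into a sum of local rewards, one per group of neighbouring agents, that each local reward is Bernoulli with a Beta(1, 1) prior, and that the joint action maximising the sampled rewards is found by brute-force enumeration (a simplification that only works for tiny problems). The function name `mats_sketch` and all parameter names are hypothetical.

```python
import itertools
import random


def mats_sketch(n_actions, groups, true_means, horizon, seed=0):
    """Thompson sampling over a factored (loosely-coupled) reward.

    n_actions:  number of actions per agent, e.g. [2, 2, 2]
    groups:     tuples of agent indices that share a local reward
    true_means: per group, dict {local joint action: Bernoulli mean}
                (assumed, used only to simulate rewards)
    """
    rng = random.Random(seed)
    # One Beta(1, 1) posterior [alpha, beta] per local joint action of each group.
    posteriors = []
    for g in groups:
        local_actions = itertools.product(*(range(n_actions[i]) for i in g))
        posteriors.append({a: [1, 1] for a in local_actions})
    joint_actions = list(itertools.product(*(range(k) for k in n_actions)))
    history = []
    for _ in range(horizon):
        # Draw one plausible mean for every local joint action.
        sampled = [
            {a: rng.betavariate(ab[0], ab[1]) for a, ab in post.items()}
            for post in posteriors
        ]

        def value(joint):
            # Sum of sampled local means under this joint action.
            return sum(
                s[tuple(joint[i] for i in g)] for s, g in zip(sampled, groups)
            )

        # Assumption: brute-force argmax over all joint actions.
        best = max(joint_actions, key=value)
        # Simulate one local reward per group and update its posterior.
        for g, post, means in zip(groups, posteriors, true_means):
            local = tuple(best[i] for i in g)
            reward = 1 if rng.random() < means[local] else 0
            post[local][0] += reward
            post[local][1] += 1 - reward
        history.append(best)
    return history, posteriors


# Toy chain of three agents: agent 1 is coupled to agents 0 and 2 only.
pairs = list(itertools.product(range(2), repeat=2))
toy_means = [{a: (0.9 if a == (1, 1) else 0.1) for a in pairs} for _ in range(2)]
history, posteriors = mats_sketch([2, 2, 2], [(0, 1), (1, 2)], toy_means, 200)
print(history[-1])
```

The posteriors are kept per group rather than per full joint action, so their number grows with the size of the largest group instead of with the total number of agents. Only the enumeration in the argmax step scales exponentially here, which is why it is flagged as a simplification.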

Code & Models

Repositories

timo-verstraeten/mats-experiments