# Heterogeneous Stochastic Interactions for Multiple Agents in a Multi-armed Bandit Problem

**Authors:** Udari Madhushani, Naomi Ehrich Leonard

arXiv: 1905.08731 · 2019-05-22

## TL;DR

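This paper studies a multi-agent multi-armed bandit problem in which each agent can observe its neighbors' choices and rewards over a network with heterogeneous, stochastic links. It designs an algorithm for each agent and proves performance bounds that depend on the network structure and on each agent's sociability, the probability that the agent observes its neighbors.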

## Contribution

The paper defines a multi-agent multi-armed bandit model with heterogeneous stochastic interactions, designs an algorithm for each agent to maximize its own expected cumulative reward, and proves performance bounds for that algorithm.

## Key findings

- Performance bounds depend on network structure and agent sociability.
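- The bounds predict a rank ordering of agents by performance, and the authors verify the accuracy of this prediction analytically and computationally.
- Interactions are heterogeneous and stochastic: each agent's sociability sets the probability that it observes its neighbors' choices and rewards.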

## Abstract

We define and analyze a multi-agent multi-armed bandit problem in which decision-making agents can observe the choices and rewards of their neighbors. Neighbors are defined by a network graph with heterogeneous and stochastic interconnections. These interactions are determined by the sociability of each agent, which corresponds to the probability that the agent observes its neighbors. We design an algorithm for each agent to maximize its own expected cumulative reward and prove performance bounds that depend on the sociability of the agents and the network structure. We use the bounds to predict the rank ordering of agents according to their performance and verify the accuracy analytically and computationally.
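The observation model described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' algorithm: it assumes a standard UCB1-style index rule, Gaussian rewards with unit variance, and a single observation draw per agent per step that reveals all of that agent's neighbors at once. The function name `simulate` and the input format (`adjacency`, `sociability`, `arm_means`) are hypothetical choices made for this example.

```python
import math
import random


def simulate(adjacency, sociability, arm_means, horizon, seed=0):
    """UCB1-style multi-agent bandit with stochastic neighbor observations.

    adjacency[k]   : list of neighbor indices of agent k (assumed format)
    sociability[k] : probability that agent k observes its neighbors at a step
    arm_means[i]   : mean reward of arm i (unit-variance Gaussian assumed)
    Returns, per agent, the number of pulls of a suboptimal arm.
    """
    rng = random.Random(seed)
    n_agents, n_arms = len(adjacency), len(arm_means)
    best_arm = max(range(n_arms), key=lambda i: arm_means[i])

    # counts[k][i] / sums[k][i]: samples of arm i known to agent k,
    # from its own pulls plus any neighbor pulls it observed.
    counts = [[0] * n_arms for _ in range(n_agents)]
    sums = [[0.0] * n_arms for _ in range(n_agents)]
    suboptimal = [0] * n_agents

    for t in range(1, horizon + 1):
        choices, rewards = [], []
        for k in range(n_agents):
            untried = [i for i in range(n_arms) if counts[k][i] == 0]
            if untried:
                arm = untried[0]  # sample every arm at least once
            else:
                # Standard UCB1 index on everything agent k has seen so far.
                arm = max(
                    range(n_arms),
                    key=lambda i: sums[k][i] / counts[k][i]
                    + math.sqrt(2.0 * math.log(t) / counts[k][i]),
                )
            choices.append(arm)
            rewards.append(rng.gauss(arm_means[arm], 1.0))
            if arm != best_arm:
                suboptimal[k] += 1

        for k in range(n_agents):
            observed = [k]  # an agent always sees its own choice and reward
            if rng.random() < sociability[k]:
                observed += adjacency[k]  # assumption: all neighbors at once
            for j in observed:
                counts[k][choices[j]] += 1
                sums[k][choices[j]] += rewards[j]

    return suboptimal


if __name__ == "__main__":
    # Three agents on a line graph; agent 1 sits in the middle.
    graph = [[1], [0, 2], [1]]
    print(simulate(graph, [0.1, 0.9, 0.5], [0.2, 0.5, 0.8], horizon=2000))
```

Running the script prints one suboptimal-pull count per agent, which gives a rough way to compare agents with different sociability values and network positions. Any ranking read off from it reflects the assumptions above, not the bounds proved in the paper.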

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1905.08731/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/1905.08731/full.md

## References

18 references — full list in the complete paper: https://tomesphere.com/paper/1905.08731/full.md

---
Source: https://tomesphere.com/paper/1905.08731