Multi-Armed Bandits with Network Interference
Abhineet Agarwal, Anish Agarwal, Lorenzo Masoero, and Justin Whitehouse

TL;DR
This paper introduces a novel multi-armed bandit approach for online experiments with network interference, using sparse linear models to effectively minimize regret in complex, interconnected environments.
Contribution
It develops a new framework for adaptive treatment assignment under network interference using Fourier analysis and linear regression, handling unknown interference neighborhoods.
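To make the "Fourier analysis and linear regression" idea concrete, here is a minimal sketch under simplifying assumptions that are not taken from the paper: actions are binary (encoded as -1/+1), the interference neighborhood of one unit is known, and rewards are noiseless. The names `fourier_features` and `toy_reward`, the sizes, and the coefficients are all hypothetical. The sketch only illustrates that a reward depending on a small neighborhood is linear in the parity (Fourier character) features of that neighborhood, so ordinary least squares can recover it; it is not the authors' algorithm.

```python
import itertools
import numpy as np

def fourier_features(actions, neighborhood):
    """Parity features for every subset of `neighborhood`.

    actions: sequence of length N with entries in {-1, +1}.
    neighborhood: indices of the units assumed to affect the reward.
    Subsets are ordered by size, then lexicographically.
    """
    feats = []
    for k in range(len(neighborhood) + 1):
        for subset in itertools.combinations(neighborhood, k):
            # The product over an empty subset is 1.0 (the constant term).
            feats.append(np.prod([actions[i] for i in subset]))
    return np.array(feats, dtype=float)

def toy_reward(a):
    # Hypothetical reward for one unit: depends only on units 0, 2 and 3.
    return 1.0 + 0.5 * a[0] - 0.3 * a[2] * a[3] + 0.2 * a[0] * a[2] * a[3]

N = 6                      # number of units (assumed)
neighborhood = (0, 2, 3)   # assumed known neighborhood of the unit

# Enumerate every joint assignment of the N units (2**N rows).
assignments = np.array(list(itertools.product([-1, 1], repeat=N)))
X = np.stack([fourier_features(a, neighborhood) for a in assignments])
y = np.array([toy_reward(a) for a in assignments])

# Least squares over 2**3 = 8 features instead of one mean per joint action.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.round(coef, 3))
```

The design point is the dimension count: the regression has one coefficient per subset of the neighborhood (8 here), whereas treating each joint assignment as its own arm would require estimating 64 separate means for the same 6 units.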
Findings
Algorithms achieve low regret in simulations.
Effective in both known and unknown interference neighborhoods.
Generalizes previous models by relaxing interference assumptions.
Abstract
Online experimentation with interference is a common challenge in modern applications such as e-commerce and adaptive clinical trials in medicine. For example, in online marketplaces, the revenue of a good depends on discounts applied to competing goods. Statistical inference with interference is widely studied in the offline setting, but far less is known about how to adaptively assign treatments to minimize regret. We address this gap by studying a multi-armed bandit (MAB) problem where a learner (e-commerce platform) sequentially assigns one of A possible actions (discounts) to N units (goods) over T rounds to minimize regret (maximize revenue). Unlike traditional MAB problems, the reward of each unit depends on the treatments assigned to other units, i.e., there is interference across the underlying network of units. With A actions and N units,…
Taxonomy
Topics: Advanced Bandit Algorithms Research · Cognitive Radio Networks and Spectrum Sensing · Smart Grid Energy Management
