Adaptive Policy Learning Under Unknown Network Interference
Aidan Gleich, Eric Laber, Alexander Volfovsky

TL;DR
This paper introduces a Thompson sampling algorithm that learns the interference network and optimizes treatment policies in adaptive experiments, improving outcomes and enabling causal analysis under unknown network interference.
Contribution
The paper develops a method that jointly learns an unknown interference network and optimizes individual-level treatment allocation, using a Gibbs sampler within Thompson sampling, and supports it with theoretical regret bounds and empirical validation.
Findings
Achieves sublinear regret on real-world networks.
Reduces regret by over an order of magnitude compared to baselines.
Provides accurate downstream causal effect estimates.
Abstract
Adaptive experimentation under unknown network interference requires solving two coupled problems: (i) learning the underlying dynamics of interference among units and (ii) using these dynamics to inform treatment allocation in order to maximize a cumulative outcome of interest (e.g., revenue). Existing adaptive experimentation methods either assume the interference network is fully known or bypass the network by operating on coarse cluster-level randomizations. We develop a Thompson sampling algorithm that jointly learns the interference network and adaptively optimizes individual-level treatment allocations via a Gibbs sampler. The algorithm returns both an optimized treatment policy and an estimate of the interference network; the latter supports downstream causal analyses such as estimation of direct, indirect, and total treatment effects. For additive spillover models, we show that…
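To make the Thompson sampling loop described in the abstract concrete, here is a minimal Python sketch under strong simplifying assumptions. It is not the paper's algorithm: the interference network `A` is treated as known, so the Gibbs sampling step that learns the network is omitted entirely. The outcome model is an assumed linear-Gaussian additive spillover model, and all names (`thompson_step`, `features`, `run`, `prior_var`, `noise_var`, the true parameter values) are hypothetical choices made for illustration.

```python
import numpy as np

def features(a, A):
    # Design matrix columns: intercept, own treatment, exposure sum_j A[i, j] * a_j.
    return np.column_stack([np.ones_like(a), a, A @ a])

def thompson_step(X, y, A, rng, prior_var=10.0, noise_var=1.0):
    """One Thompson sampling round for a toy additive spillover model.

    Assumed model (illustrative only, not taken from the paper):
        y_i = alpha + beta * a_i + gamma * sum_j A[i, j] * a_j + noise
    with a known 0/1 matrix A, where A[i, j] = 1 means j's treatment spills onto i.
    """
    # Conjugate Gaussian posterior for theta = (alpha, beta, gamma).
    precision = np.eye(3) / prior_var + X.T @ X / noise_var
    cov = np.linalg.inv(precision)
    cov = (cov + cov.T) / 2.0  # symmetrize against round-off
    mean = cov @ (X.T @ y) / noise_var
    # Draw one plausible parameter vector from the posterior.
    _, beta, gamma = rng.multivariate_normal(mean, cov)
    # Under the additive model, treating unit j changes the total outcome by
    # beta + gamma * (number of units that j spills onto), i.e. column sum j of A.
    gain = beta + gamma * A.sum(axis=0)
    return (gain > 0).astype(float)

def run(n=30, rounds=40, seed=0):
    rng = np.random.default_rng(seed)
    A = (rng.random((n, n)) < 0.1).astype(float)  # hypothetical random network
    np.fill_diagonal(A, 0.0)
    theta_true = np.array([1.0, -0.5, 0.4])       # hypothetical true parameters
    X, y = np.empty((0, 3)), np.empty(0)
    for _ in range(rounds):
        a = thompson_step(X, y, A, rng)            # allocate under sampled parameters
        Xt = features(a, A)
        yt = Xt @ theta_true + rng.normal(size=n)  # observe noisy outcomes
        X, y = np.vstack([X, Xt]), np.concatenate([y, yt])
    return a, A, theta_true
```

Calling `run()` returns the last allocation, the network, and the true parameters. Because the assumed model is additive, the allocation that is optimal for a sampled parameter vector decomposes unit by unit, which is why the sketch can pick treatments with a simple threshold on `gain`. Handling an unknown network, as the paper does, would additionally require sampling `A` from its posterior in each round.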
