StreamingBandit: Experimenting with Bandit Policies
Jules Kruijswijk, Robin van Emden, Petri Parvinen, Maurits Kaptein

TL;DR
StreamingBandit is a Python web app that facilitates the development, testing, and comparison of bandit policies in real-time field experiments, simplifying sequential treatment allocation.
Contribution
The paper introduces StreamingBandit, a tool that makes it easier to implement and evaluate bandit policies in applied social science studies.
Findings
Supports real-time policy testing and comparison
Enables quick development and reuse of bandit policies
Facilitates sequential treatment allocation in field experiments
Abstract
A large number of statistical decision problems in the social sciences and beyond can be framed as a (contextual) multi-armed bandit problem. However, it is notoriously hard to develop and evaluate policies that tackle these types of problems, and to use such policies in applied studies. To address this issue, this paper introduces StreamingBandit, a Python web application for developing and testing bandit policies in field studies. StreamingBandit can sequentially select treatments using (online) policies in real time. Once StreamingBandit is implemented in an applied context, different policies can be tested, altered, nested, and compared. StreamingBandit makes it easy to apply a multitude of bandit policies for sequential allocation in field experiments, and allows for the quick development and re-use of novel policies. In this article, we detail the implementation logic of…
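To make the idea of sequential treatment allocation with an online policy concrete, here is a minimal sketch of an epsilon-greedy bandit in plain Python. It is an illustration only and does not use or reproduce StreamingBandit's API: the class `EpsilonGreedy`, its methods `get_action` and `set_reward`, the helper `simulate`, and the Bernoulli reward rates are all hypothetical names and assumptions introduced here.

```python
import random


class EpsilonGreedy:
    """Hypothetical epsilon-greedy policy (illustrative; not StreamingBandit's API)."""

    def __init__(self, n_arms, epsilon=0.1, seed=None):
        self.epsilon = epsilon
        self.counts = [0] * n_arms    # number of times each treatment was allocated
        self.means = [0.0] * n_arms   # running mean reward per treatment
        self.rng = random.Random(seed)

    def get_action(self):
        # Explore a uniformly random arm with probability epsilon,
        # otherwise exploit the arm with the highest estimated mean.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))
        return max(range(len(self.means)), key=self.means.__getitem__)

    def set_reward(self, arm, reward):
        # Online (streaming) update of the mean: no past data is stored.
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]


def simulate(true_rates, n_rounds=5000, epsilon=0.1, seed=0):
    """Run the policy against assumed Bernoulli arms with the given success rates."""
    policy = EpsilonGreedy(len(true_rates), epsilon, seed)
    env = random.Random(seed + 1)
    for _ in range(n_rounds):
        arm = policy.get_action()
        reward = 1.0 if env.random() < true_rates[arm] else 0.0
        policy.set_reward(arm, reward)
    return policy


if __name__ == "__main__":
    fitted = simulate([0.2, 0.8])
    print("allocations per arm:", fitted.counts)
    print("estimated mean rewards:", fitted.means)
```

The sketch separates choosing a treatment from updating on the observed outcome, which is the pattern an online policy needs when outcomes arrive one at a time. In a field experiment the simulated environment would be replaced by real participant responses.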
Taxonomy
Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Machine Learning and Data Classification
