Artificial Replay: A Meta-Algorithm for Harnessing Historical Data in Bandits
Siddhartha Banerjee, Sean R. Sinclair, Milind Tambe, Lily Xu, Christina Lee Yu

TL;DR
ArtificialReplay is a meta-algorithm that efficiently incorporates historical data into bandit algorithms, improving data efficiency and regret performance, especially in real-world scenarios with mixed offline and online data.
Contribution
We introduce ArtificialReplay, a novel meta-algorithm that uses less historical data than a full warm start while preserving that approach's regret for a broad class of base bandit algorithms.
Findings
ArtificialReplay matches the regret of a full warm start while using only a fraction of the historical data.
It improves data efficiency in real-world bandit applications.
Experimental results confirm practical benefits across different bandit types.
Abstract
Most real-world deployments of bandit algorithms exist somewhere in between the offline and online set-up, where some historical data is available upfront and additional data is collected dynamically online. How best to incorporate historical data to "warm start" bandit algorithms is an open question: naively initializing reward estimates using all historical samples can suffer from spurious data and imbalanced data coverage, leading to data inefficiency (amount of historical data used) - particularly for continuous action spaces. To address these challenges, we propose ArtificialReplay, a meta-algorithm for incorporating historical data into any arbitrary base bandit algorithm. We show that ArtificialReplay uses only a fraction of the historical data compared to a full warm-start approach, while still achieving identical regret for base algorithms that satisfy independence of…
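The abstract above does not spell out the mechanism, so the following is only a minimal sketch under a stated assumption: whenever the base algorithm proposes an action that still has an unused historical sample, that sample is fed to the base algorithm in place of an online pull, and the environment is queried only when no such sample remains. The names `UCB`, `artificial_replay`, `pull`, and `true_means` are hypothetical and chosen for illustration; the finite-armed UCB1 base algorithm is a stand-in, not necessarily the one evaluated in the paper.

```python
import math
import random
from collections import defaultdict


class UCB:
    """Minimal UCB1 base algorithm over a finite set of arms (illustrative)."""

    def __init__(self, n_arms):
        self.counts = [0] * n_arms
        self.means = [0.0] * n_arms

    def select(self):
        # Try every arm once before relying on confidence bounds.
        for arm, n in enumerate(self.counts):
            if n == 0:
                return arm
        t = sum(self.counts)
        return max(
            range(len(self.counts)),
            key=lambda a: self.means[a] + math.sqrt(2 * math.log(t) / self.counts[a]),
        )

    def update(self, arm, reward):
        # Incremental running-mean update.
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]


def artificial_replay(base, history, pull, horizon):
    """Run `base` for `horizon` online pulls, replaying history on demand.

    history: list of (arm, reward) pairs collected offline.
    pull:    function arm -> reward for the online environment.
    Returns (online_actions, number_of_historical_samples_used).
    """
    unused = defaultdict(list)
    for arm, reward in history:
        unused[arm].append(reward)

    used = 0
    actions = []
    while len(actions) < horizon:
        arm = base.select()
        if unused[arm]:
            # Assumed replay step: consume a stored sample, no online pull.
            base.update(arm, unused[arm].pop())
            used += 1
        else:
            # No history left for this arm: act in the real environment.
            base.update(arm, pull(arm))
            actions.append(arm)
    return actions, used


# Usage: the history covers only the worst arm (imbalanced coverage).
rng = random.Random(0)
true_means = [0.2, 0.5, 0.8]  # hypothetical Bernoulli reward means
history = [(0, float(rng.random() < true_means[0])) for _ in range(200)]
actions, used = artificial_replay(
    UCB(3), history, lambda a: float(rng.random() < true_means[a]), horizon=500
)
print(f"historical samples used: {used} of {len(history)}")
```

In this sketch a historical sample is touched only when the base algorithm would have chosen that arm anyway, so samples on arms the algorithm stops selecting are never read. A full warm start would instead call `update` on every pair in `history` before the first online pull.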
Taxonomy
Topics: Advanced Bandit Algorithms Research · Data Stream Mining Techniques · Mobile Crowdsensing and Crowdsourcing
Methods: Balanced Selection
