A Contextual Bandit Bake-off

Alberto Bietti; Alekh Agarwal; John Langford

arXiv:1802.04064 · stat.ML · June 8, 2021 · 53 cites

TL;DR

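This paper empirically evaluates contextual bandit algorithms on a large number of supervised learning datasets. A recent optimism-under-uncertainty method (Foster et al., 2018) performs best overall, a simple greedy baseline is a close second, and a variant of Online Cover (Agarwal et al., 2014) follows, being more conservative but robust to problem specification.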

Contribution

It provides an empirical comparison of contextual bandit algorithms on a large number of supervised learning datasets, focusing on practical methods that learn through optimization oracles from supervised learning.

Findings

01

A recent method based on optimism under uncertainty (Foster et al., 2018) performs best overall.

02

A simple greedy baseline, which explores only implicitly through the diversity of contexts, is a close second.

03

A variant of Online Cover (Agarwal et al., 2014) comes next: it tends to be more conservative but is robust to problem specification by design.

Abstract

Contextual bandit algorithms are essential for solving many real-world interactive machine learning problems. Despite multiple recent successes on statistically and computationally efficient methods, the practical behavior of these algorithms is still poorly understood. We leverage the availability of large numbers of supervised learning datasets to empirically evaluate contextual bandit algorithms, focusing on practical methods that learn by relying on optimization oracles from supervised learning. We find that a recent method (Foster et al., 2018) using optimism under uncertainty works the best overall. A surprisingly close second is a simple greedy baseline that only explores implicitly through the diversity of contexts, followed by a variant of Online Cover (Agarwal et al., 2014) which tends to be more conservative but robust to problem specification by design. Along the way, we…
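The evaluation idea in the abstract, turning a supervised dataset into a simulated bandit problem, can be illustrated with a minimal sketch. This is not the authors' code. It assumes a synthetic multiclass dataset, 0/1 loss revealed only for the chosen action, and one online ridge-regression loss predictor per action as a stand-in for an optimization oracle. The names `make_dataset` and `run_bandit` are hypothetical. Setting `epsilon=0.0` gives a greedy policy that explores only through the diversity of contexts.

```python
import numpy as np


def make_dataset(n=2000, d=5, k=3, seed=0):
    """Synthetic multiclass data (hypothetical stand-in for a real supervised dataset)."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(k, d))
    X = rng.normal(size=(n, d))
    y = np.argmax(X @ W.T + 0.1 * rng.normal(size=(n, k)), axis=1)
    return X, y


def run_bandit(X, y, k, epsilon=0.0, seed=0, reg=1.0):
    """Simulate bandit feedback from labels and return the progressive average loss.

    epsilon=0.0 is greedy; epsilon>0 is epsilon-greedy (assumed baselines for illustration).
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Per-action ridge regression statistics: A[a] = reg*I + sum x x^T, b[a] = sum loss * x.
    A = np.stack([reg * np.eye(d) for _ in range(k)])
    b = np.zeros((k, d))
    total_loss = 0.0
    for x, label in zip(X, y):
        # Predicted loss of each action under the current regressors.
        theta = np.stack([np.linalg.solve(A[a], b[a]) for a in range(k)])
        pred_loss = theta @ x
        if rng.random() < epsilon:
            action = int(rng.integers(k))      # uniform exploration
        else:
            action = int(np.argmin(pred_loss))  # exploit current estimates
        # Bandit feedback: only the chosen action's loss is observed.
        loss = 0.0 if action == label else 1.0
        total_loss += loss
        A[action] += np.outer(x, x)
        b[action] += loss * x
    return total_loss / n


if __name__ == "__main__":
    X, y = make_dataset()
    print("greedy:", run_bandit(X, y, k=3, epsilon=0.0))
    print("eps-greedy:", run_bandit(X, y, k=3, epsilon=0.1))
```

The progressive average loss is computed on the actions the learner actually takes, so the full label is used only to score the chosen action and is never shown to the learner.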

Code & Models

Repositories

albietz/cb_bakeoff
Official

Taxonomy

Topics: Advanced Bandit Algorithms Research · Data Stream Mining Techniques · Machine Learning and Algorithms