Contextual Bandits with Similarity Information

Aleksandrs Slivkins

arXiv:0907.3986 · cs.DS · May 21, 2014 · 255 cites

TL;DR

This paper introduces adaptive algorithms for contextual bandits that leverage similarity information to improve decision-making efficiency, especially in large or infinite strategy spaces, by focusing on relevant contexts and high-payoff arms.

Contribution

The paper proposes adaptive partitioning algorithms for contextual bandits that use similarity information more efficiently than uniform partitioning does.
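For orientation, the uniform-partitioning approach that serves as the point of comparison can be sketched as follows. This is a minimal illustration under assumptions that do not come from the paper: contexts and arms are both points in [0, 1], each is cut into a fixed grid of equal cells, and a standard UCB1 index is run independently inside every context cell. The class name `UniformPartitionUCB`, the helper `simulate`, and the payoff function `1 - |x - a|` are all hypothetical. The adaptive algorithms proposed in the paper are not shown here.

```python
import math
import random


class UniformPartitionUCB:
    """UCB1 run independently in each cell of a fixed, uniform context grid.

    Hypothetical baseline sketch; not the adaptive algorithm from the paper.
    """

    def __init__(self, n_context_cells, n_arm_cells):
        self.n_context_cells = n_context_cells
        self.n_arm_cells = n_arm_cells
        # Per (context cell, arm cell): number of plays and summed payoff.
        self.counts = [[0] * n_arm_cells for _ in range(n_context_cells)]
        self.sums = [[0.0] * n_arm_cells for _ in range(n_context_cells)]

    def _context_cell(self, x):
        # Map a context in [0, 1] to its grid cell; x == 1.0 goes to the last cell.
        return min(int(x * self.n_context_cells), self.n_context_cells - 1)

    def arm_center(self, j):
        # Each arm cell is represented by its midpoint.
        return (j + 0.5) / self.n_arm_cells

    def select(self, x):
        c = self._context_cell(x)
        counts = self.counts[c]
        # Play every arm cell once in this context cell before using the index.
        for j, n in enumerate(counts):
            if n == 0:
                return j
        total = sum(counts)

        def ucb_index(j):
            mean = self.sums[c][j] / counts[j]
            return mean + math.sqrt(2.0 * math.log(total) / counts[j])

        return max(range(self.n_arm_cells), key=ucb_index)

    def update(self, x, j, payoff):
        c = self._context_cell(x)
        self.counts[c][j] += 1
        self.sums[c][j] += payoff


def simulate(rounds=5000, seed=0):
    """Run the baseline on a hypothetical payoff that is similar for nearby (x, a)."""
    rng = random.Random(seed)
    policy = UniformPartitionUCB(n_context_cells=5, n_arm_cells=5)
    total = 0.0
    for _ in range(rounds):
        x = rng.random()                      # context revealed before the round
        j = policy.select(x)
        a = policy.arm_center(j)
        mean = 1.0 - abs(x - a)               # assumed expected payoff
        payoff = 1.0 if rng.random() < mean else 0.0
        policy.update(x, j, payoff)
        total += payoff
    return policy, total / rounds
```

Calling `simulate()` returns the fitted policy and its average payoff per round. The grid resolution is fixed in advance and is the same everywhere, so rarely seen contexts and clearly poor arms get the same resolution as the regions that matter; that is the inefficiency an adaptive partition is meant to avoid.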

Findings

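01. The proposed adaptive-partitioning algorithms outperform methods based on uniform partitioning.

02. The partitions adapt to the problem instance, refining around frequently occurring contexts and high-payoff arms.

03. Similarity information is used more efficiently when the strategy set is very large or infinite.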

Abstract

In a multi-armed bandit (MAB) problem, an online algorithm makes a sequence of choices. In each round it chooses from a time-invariant set of alternatives and receives the payoff associated with this alternative. While the case of small strategy sets is by now well-understood, a lot of recent work has focused on MAB problems with exponentially or infinitely large strategy sets, where one needs to assume extra structure in order to make the problem tractable. In particular, recent literature considered information on similarity between arms. We consider similarity information in the setting of "contextual bandits", a natural extension of the basic MAB problem where before each round an algorithm is given the "context" -- a hint about the payoffs in this round. Contextual bandits are directly motivated by placing advertisements on webpages, one of the crucial problems in sponsored…

Taxonomy

Topics: Advanced Bandit Algorithms Research · Optimization and Search Problems · Reinforcement Learning in Robotics