Resourceful Contextual Bandits

Ashwinkumar Badanidiyuru; John Langford; Aleksandrs Slivkins

arXiv:1402.6779 · cs.LG · August 3, 2015 · 37 cites

Open Access

TL;DR

This paper introduces the first algorithm for contextual bandits with resource constraints that handles constrained resources other than time and improves over a trivial reduction to the non-contextual case.

Contribution

The first algorithm for contextual bandits with constrained resources other than time, with a regret guarantee that has near-optimal statistical properties.

Findings

01 Achieves near-optimal regret bounds

02 Handles arbitrary policy sets and resource constraints

03 Improves over trivial non-contextual reductions

Abstract

We study contextual bandits with ancillary constraints on resources, which are common in real-world applications such as choosing ads or dynamic pricing of items. We design the first algorithm for solving these problems that handles constrained resources other than time, and improves over a trivial reduction to the non-contextual case. We consider very general settings for both contextual bandits (arbitrary policy sets, e.g. Dudik et al. (UAI'11)) and bandits with resource constraints (bandits with knapsacks, Badanidiyuru et al. (FOCS'13)), and prove a regret guarantee with near-optimal statistical properties.
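
To make the setting concrete, here is a minimal sketch of the interaction loop for the dynamic pricing example mentioned in the abstract: each round a customer (the context) arrives, a policy picks a price (the arm), and a sale yields revenue while consuming one unit of a limited inventory (the resource). This only simulates a fixed policy in a toy environment; it is not the paper's algorithm. The names `simulate`, `contextual_policy`, `fixed_low_price`, and the `SALE_PROB` table are hypothetical and chosen purely for illustration.

```python
import random

# Hypothetical toy numbers (not from the paper): probability that a
# customer of a given type buys at a given price.
SALE_PROB = {
    0: {1.0: 0.8, 2.0: 0.1},  # price-sensitive customer
    1: {1.0: 0.9, 2.0: 0.7},  # less price-sensitive customer
}


def simulate(policy, horizon, inventory, seed=0):
    """Run one fixed policy until the horizon ends or inventory runs out.

    Returns (revenue, units_sold, rounds_played).
    """
    rng = random.Random(seed)
    revenue, remaining, rounds = 0.0, inventory, 0
    for _ in range(horizon):
        if remaining == 0:
            break  # the resource is exhausted, so the process stops early
        context = rng.choice([0, 1])  # customer type observed this round
        price = policy(context)       # policy maps context -> arm (price)
        rounds += 1
        if rng.random() < SALE_PROB[context][price]:
            revenue += price          # reward: the price paid
            remaining -= 1            # consumption: one unit of inventory
    return revenue, inventory - remaining, rounds


def contextual_policy(context):
    """Charge the high price only to the less price-sensitive customer."""
    return 2.0 if context == 1 else 1.0


def fixed_low_price(context):
    """Ignore the context and always charge the low price."""
    return 1.0


if __name__ == "__main__":
    for name, policy in [("contextual", contextual_policy),
                         ("fixed low price", fixed_low_price)]:
        revenue, sold, rounds = simulate(policy, horizon=200, inventory=20)
        print(f"{name}: revenue={revenue}, sold={sold}, rounds={rounds}")
```

The two policies illustrate why the context matters under a resource constraint: a policy that ignores the context may spend the inventory on low-price sales, while a contextual policy can reserve units for customers likely to pay more. A learning algorithm for this setting would have to discover such a policy from feedback, which this sketch does not attempt.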

Taxonomy

Topics: Advanced Bandit Algorithms Research · Optimization and Search Problems · Auction Theory and Applications