Budget-Constrained Bandits over General Cost and Reward Distributions

Semih Cayci; Atilla Eryilmaz; R. Srikant

arXiv:2003.00365 · cs.LG · March 3, 2020 · 5 cites

TL;DR

This paper studies a budget-constrained bandit problem in which each arm pull incurs a random cost and yields a random reward, where the two may be correlated. It proposes algorithms that achieve near-optimal regret bounds in the general and Gaussian cases.

Contribution

It introduces algorithms that exploit cost-reward correlation and provides tight regret bounds, including a lower bound, for budget-constrained bandits with general distributions.

Findings

01

Achieves $O(\log B)$ regret for a budget $B > 0$ when moments of order $(2 + \gamma)$, for some $\gamma > 0$, exist for all cost-reward pairs.

02

Proposes algorithms using linear MMSE estimation to exploit correlation.

03

Establishes tight regret bounds, optimal up to constants, for Gaussian cases.

Abstract

We consider a budget-constrained bandit problem where each arm pull incurs a random cost, and yields a random reward in return. The objective is to maximize the total expected reward under a budget constraint on the total cost. The model is general in the sense that it allows correlated and potentially heavy-tailed cost-reward pairs that can take on negative values as required by many applications. We show that if moments of order $(2 + \gamma)$ for some $\gamma > 0$ exist for all cost-reward pairs, $O(\log B)$ regret is achievable for a budget $B > 0$. In order to achieve tight regret bounds, we propose algorithms that exploit the correlation between the cost and reward of each arm by extracting the common information via linear minimum mean-square error estimation. We prove a regret lower bound for this problem, and show that the proposed algorithms achieve tight problem-dependent regret…
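
The abstract's idea of extracting the common information between cost and reward via linear minimum mean-square error (LMMSE) estimation can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the helper names `lmmse_coefficient` and `variance`, the synthetic Gaussian cost-reward pairs, and the slope 0.8 are all assumptions chosen for illustration. The sketch only shows that subtracting the best linear function of the cost from the reward leaves a residual with lower variance than the raw reward, which is the kind of variance reduction a correlation-aware estimator can draw on.

```python
import random


def variance(values):
    """Empirical (population) variance of a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n


def lmmse_coefficient(costs, rewards):
    """Empirical LMMSE slope Cov(X, R) / Var(X) for predicting reward R from cost X."""
    n = len(costs)
    mean_x = sum(costs) / n
    mean_r = sum(rewards) / n
    cov_xr = sum((x - mean_x) * (r - mean_r) for x, r in zip(costs, rewards)) / n
    return cov_xr / variance(costs)


# Hypothetical single arm: correlated cost-reward pairs (values may be negative).
rng = random.Random(0)
costs = [rng.gauss(1.0, 0.5) for _ in range(5000)]
rewards = [0.8 * x + rng.gauss(0.0, 0.3) for x in costs]

omega = lmmse_coefficient(costs, rewards)

# Residual after removing the part of the reward that is linearly explained by the cost.
residuals = [r - omega * x for x, r in zip(costs, rewards)]

print("estimated slope:", omega)
print("reward variance:", variance(rewards))
print("residual variance:", variance(residuals))
```

Because the empirical slope minimizes the empirical variance of `r - omega * x` over all choices of `omega`, the residual variance can never exceed the raw reward variance on the same sample; the stronger the cost-reward correlation, the larger the reduction.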

Taxonomy

Topics: Advanced Bandit Algorithms Research · Optimization and Search Problems · Reinforcement Learning in Robotics