Multi-Armed Bandits with Censored Consumption of Resources

Viktor Bengs; Eyke Hüllermeier

arXiv:2011.00813·cs.LG·October 18, 2022

Open Access

TL;DR

This paper introduces a resource-aware multi-armed bandit model with censored rewards, proposing a UCB-inspired algorithm that balances reward maximization and resource minimization, supported by theoretical analysis and simulations.

Contribution

The paper formulates a novel bandit problem with resource constraints and censored feedback, and develops a new algorithm with proven regret bounds.

Findings

01 The proposed algorithm outperforms standard bandit algorithms in simulations.

02 Theoretical regret bounds are established for the new algorithm.

03 Resource-aware exploration improves reward realization under resource limits.

Abstract

We consider a resource-aware variant of the classical multi-armed bandit problem: In each round, the learner selects an arm and determines a resource limit. It then observes a corresponding (random) reward, provided the (random) amount of consumed resources remains below the limit. Otherwise, the observation is censored, i.e., no reward is obtained. For this problem setting, we introduce a measure of regret, which incorporates the actual amount of allocated resources of each learning round as well as the optimality of realizable rewards. Thus, to minimize regret, the learner needs to set a resource limit and choose an arm in such a way that the chance to realize a high reward within the predefined resource limit is high, while the resource limit itself should be kept as low as possible. We propose a UCB-inspired online learning algorithm, which we analyze theoretically in terms of its…
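
The round protocol described in the abstract can be illustrated with a small simulation. The following is a minimal sketch of the environment side only, not the authors' algorithm or their regret measure. The names `Arm`, `play_round`, and `censoring_rate` are hypothetical, and the Bernoulli reward and exponentially distributed resource consumption are assumptions chosen purely for illustration; the paper does not prescribe them here.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class Arm:
    """Hypothetical arm (illustrative distributions, not from the paper)."""
    reward_prob: float       # assumed: reward is Bernoulli(reward_prob)
    mean_consumption: float  # assumed: consumption is exponential with this mean


def play_round(arm: Arm, limit: float, rng: random.Random) -> Optional[float]:
    """Play one round with the chosen arm and resource limit.

    Returns the reward if the consumed resources stay within the limit,
    and None if the observation is censored (no reward is obtained).
    """
    consumption = rng.expovariate(1.0 / arm.mean_consumption)
    reward = 1.0 if rng.random() < arm.reward_prob else 0.0
    # Assumption: "remains below the limit" is read as consumption <= limit.
    if consumption <= limit:
        return reward
    return None


def censoring_rate(arm: Arm, limit: float, rounds: int, seed: int = 0) -> float:
    """Estimate how often a fixed limit censors the observation for one arm."""
    rng = random.Random(seed)
    censored = sum(play_round(arm, limit, rng) is None for _ in range(rounds))
    return censored / rounds
```

Calling `censoring_rate` for the same arm with different limits shows the trade-off the abstract describes: a larger limit censors fewer observations but allocates more resources per round, so a learner has to find a limit that is just large enough for a promising arm.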

Taxonomy

Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Optimization and Search Problems