Online Budget Allocation with Censored Semi-Bandit Feedback
François Bachoc, Nicolò Cesa-Bianchi, Tommaso Cesari, Roberto Colomboni

TL;DR
This paper introduces an online budget allocation algorithm for tasks with censored semi-bandit feedback, achieving near-optimal regret bounds in diminishing-returns regimes and a worst-case bound of O(K√T) for general cases.
Contribution
It proposes an optimism-based algorithm for budget allocation under censored semi-bandit feedback, with regret that is polylogarithmic in T in diminishing-returns regimes and an optimal O(K√T) in the worst case.
Findings
Regret scales polylogarithmically with T in diminishing-returns regimes.
Achieves worst-case regret of O(K√T) for general nondecreasing curves.
Establishes a matching lower bound of order K√T, which holds even for full-feedback algorithms.
Abstract
We study a stochastic budget-allocation problem over K tasks. At each round t, the learner chooses an allocation of the budget across the tasks. Each task k succeeds with a probability determined by its allocated budget through a nondecreasing budget-to-success curve, and upon success yields a random reward with unknown mean. The learner observes which tasks succeed, and observes a task's reward only upon success (censored semi-bandit feedback). This model captures, for instance, splitting payments across crowdsourcing workers or distributing bids across simultaneous auctions, and subsumes stochastic multi-armed bandits and semi-bandits. We design an optimism-based algorithm that operates under censored semi-bandit feedback. Our main result shows that in diminishing-returns regimes, the regret of this algorithm scales polylogarithmically with the horizon without any ad hoc tuning. For…
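The feedback model described in the abstract can be illustrated with a minimal simulation of a single round. This is only a sketch under stated assumptions, not the paper's algorithm: the function name `play_round`, the normalization of the total budget to 1, the two example curves, and the reward distributions are all hypothetical choices made for illustration.

```python
import random

def play_round(allocation, curves, reward_samplers, rng):
    """Simulate one round of censored semi-bandit feedback.

    Returns one (succeeded, reward) pair per task. The reward is None
    whenever the task fails, i.e. the reward is censored.
    """
    # Assumption: the total budget is normalized to 1.
    if any(x < 0 for x in allocation) or sum(allocation) > 1 + 1e-9:
        raise ValueError("allocation must be nonnegative and sum to at most 1")
    feedback = []
    for x, curve, sample_reward in zip(allocation, curves, reward_samplers):
        # The task succeeds with probability curve(x).
        succeeded = rng.random() < curve(x)
        # The reward is drawn and revealed only if the task succeeds.
        reward = sample_reward(rng) if succeeded else None
        feedback.append((succeeded, reward))
    return feedback

# Hypothetical nondecreasing budget-to-success curves on [0, 1].
curves = [lambda x: x ** 0.5, lambda x: min(1.0, 2.0 * x)]
# Hypothetical reward distributions with means unknown to the learner.
reward_samplers = [
    lambda rng: rng.uniform(0.0, 1.0),
    lambda rng: float(rng.random() < 0.3),
]

rng = random.Random(0)
feedback = play_round([0.4, 0.6], curves, reward_samplers, rng)
```

The learner sees the success indicator of every task (the semi-bandit part) but only the rewards of the tasks that succeeded (the censoring), so a learning algorithm would have to estimate both the curves and the reward means from this partial information.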
Taxonomy
Topics: Mobile Crowdsensing and Crowdsourcing · Advanced Bandit Algorithms Research · Auction Theory and Applications
