Online Budget Allocation with Censored Semi-Bandit Feedback

François Bachoc; Nicolò Cesa-Bianchi; Tommaso Cesari; Roberto Colomboni

arXiv:2508.05844 · cs.GT · February 5, 2026

Open Access

TL;DR

This paper introduces an online budget allocation algorithm for tasks with censored semi-bandit feedback, achieving regret that is polylogarithmic in T in diminishing-returns regimes and a worst-case bound of O(K√T) in the general case.

Contribution

It proposes an optimism-based algorithm for budget allocation under censored semi-bandit feedback, with regret that is polylogarithmic in T under diminishing returns and of the optimal order O(K√T) in the worst case.

Findings

01

Regret scales polylogarithmically with T in diminishing-returns regimes.

02

Achieves worst-case regret of O(K√T) for general nondecreasing curves.

03

Establishes a matching lower bound of order K√T, even for full-feedback algorithms.

Abstract

We study a stochastic budget-allocation problem over $K$ tasks. At each round $t$, the learner chooses an allocation $X_{t} \in Δ_{K}$. Task $k$ succeeds with probability $F_{k}(X_{t,k})$, where $F_{1}, \dots, F_{K}$ are nondecreasing budget-to-success curves, and upon success yields a random reward with unknown mean $μ_{k}$. The learner observes which tasks succeed, and observes a task's reward only upon success (censored semi-bandit feedback). This model captures, for instance, splitting payments across crowdsourcing workers or distributing bids across simultaneous auctions, and subsumes stochastic multi-armed bandits and semi-bandits. We design an optimism-based algorithm that operates under censored semi-bandit feedback. Our main result shows that in diminishing-returns regimes, the regret of this algorithm scales polylogarithmically with the horizon $T$ without any ad hoc tuning. For…
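The feedback model in the abstract can be made concrete with a small simulation. The sketch below is illustrative only and is not the paper's algorithm: it simulates a single round of the environment, not the learner. The function names `play_round` and `expected_reward`, the Bernoulli reward distribution, and the example curves are all assumptions introduced here; the abstract only specifies the success probabilities $F_{k}(X_{t,k})$ and the reward means $μ_{k}$. The expected-reward formula $\sum_{k} μ_{k} F_{k}(x_{k})$ likewise assumes that a task's reward is independent of its success event.

```python
import math
import random


def play_round(allocation, curves, reward_means, rng):
    """Simulate one round of the environment (hypothetical helper).

    allocation   -- nonnegative budget shares summing to 1 (a point of the simplex)
    curves       -- nondecreasing functions F_k mapping a share to a success probability
    reward_means -- means mu_k in [0, 1]; rewards are Bernoulli(mu_k) by assumption
    Returns (successes, observed), where observed[k] is None when task k fails.
    """
    if any(x < 0 for x in allocation) or abs(sum(allocation) - 1.0) > 1e-9:
        raise ValueError("allocation must lie in the probability simplex")
    successes, observed = [], []
    for x, F, mu in zip(allocation, curves, reward_means):
        ok = rng.random() < F(x)  # task succeeds with probability F_k(x_k)
        successes.append(ok)
        # Censoring: the reward is drawn and revealed only when the task succeeds.
        observed.append((1.0 if rng.random() < mu else 0.0) if ok else None)
    return successes, observed


def expected_reward(allocation, curves, reward_means):
    """Expected total reward, assuming rewards are independent of successes."""
    return sum(mu * F(x) for x, F, mu in zip(allocation, curves, reward_means))


if __name__ == "__main__":
    rng = random.Random(0)
    curves = [math.sqrt, lambda x: x]  # a concave curve and a linear one (examples)
    means = [1.0, 0.5]
    print(play_round([0.25, 0.75], curves, means, rng))
    print(expected_reward([0.25, 0.75], curves, means))
```

The learner sees the full `successes` vector but only the non-`None` entries of `observed`, which is what distinguishes this setting from full feedback.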

Taxonomy

Topics: Mobile Crowdsensing and Crowdsourcing · Advanced Bandit Algorithms Research · Auction Theory and Applications