Learning to Allocate Resources with Censored Feedback

Giovanni Montanari; Côme Fiegel; Corentin Pla; Aadirupa Saha; Vianney Perchet

arXiv:2602.06565 · cs.LG · February 9, 2026

Open Access

Abstract

We study the online resource allocation problem in which at each round, a budget $B$ must be allocated across $K$ arms under censored feedback. An arm yields a reward if and only if two conditions are satisfied: (i) the arm is activated according to an arm-specific Bernoulli random variable with unknown parameter, and (ii) the allocated budget exceeds a random threshold drawn from a parametric distribution with unknown parameter. Over $T$ rounds, the learner must jointly estimate the unknown parameters and allocate the budget so as to maximize cumulative reward facing the exploration--exploitation trade-off. We prove an information-theoretic regret lower bound $\Omega(T^{1/3})$, demonstrating the intrinsic difficulty of the problem. We then propose RA-UCB, an optimistic algorithm that leverages non-trivial parameter estimation and confidence bounds. When the budget $B$ is known at the…
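
The reward model described in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' setup or RA-UCB: the threshold distribution is taken to be exponential (the abstract only says "parametric"), "censored" is read as the learner seeing just a 0/1 outcome per arm, and the names `simulate_round`, `expected_reward`, `activation_probs`, and `threshold_rates` are hypothetical.

```python
import math
import random


def simulate_round(allocation, activation_probs, threshold_rates, rng):
    """Draw one round of 0/1 rewards, one per arm.

    Assumption (not from the paper): thresholds are exponential with
    rate threshold_rates[k]. The learner would only see the returned
    bits, not the activation draw or the threshold itself.
    """
    rewards = []
    for x, p, lam in zip(allocation, activation_probs, threshold_rates):
        activated = rng.random() < p          # condition (i): Bernoulli activation
        threshold = rng.expovariate(lam)      # condition (ii): random threshold
        rewards.append(1 if (activated and x > threshold) else 0)
    return rewards


def expected_reward(allocation, activation_probs, threshold_rates):
    """Expected total reward: sum over arms of p_k * P(threshold_k < x_k).

    Under the exponential assumption, P(threshold < x) = 1 - exp(-lam * x).
    """
    return sum(
        p * (1.0 - math.exp(-lam * x))
        for x, p, lam in zip(allocation, activation_probs, threshold_rates)
    )


if __name__ == "__main__":
    rng = random.Random(0)
    B = 1.0                                   # per-round budget
    allocation = [B / 2, B / 2]               # naive even split across K = 2 arms
    probs, rates = [0.8, 0.4], [2.0, 1.0]     # illustrative, made-up parameters
    rounds = [simulate_round(allocation, probs, rates, rng) for _ in range(20000)]
    empirical = sum(sum(r) for r in rounds) / len(rounds)
    print(empirical, expected_reward(allocation, probs, rates))
```

A zero outcome here does not reveal whether the arm failed to activate or the allocation fell below the threshold, which is what makes jointly estimating both sets of parameters from such feedback non-trivial.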

Taxonomy

Topics: Advanced Bandit Algorithms Research · Stochastic Gradient Optimization Techniques · Age of Information Optimization