Online Social Welfare Function-based Resource Allocation
Kanad Pardeshi, Samsara Foubert, Aarti Singh

TL;DR
This paper introduces a general framework for online resource allocation based on social welfare functions, providing confidence bounds, a near-optimal learning algorithm, and applications to inference tasks.
Contribution
It formalizes SWF-based online learning, develops confidence sequences for welfare bounds, and proposes SWF-UCB for near-optimal regret in resource allocation.
Findings
Confidence sequences are valid for any monotonic, concave, and Lipschitz-continuous SWF.
SWF-UCB achieves near-optimal regret over the horizon T.
Experiments demonstrate the predicted scaling and the interactions between the number of resources and SWF parameters.
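The claim that monotonicity lifts per-individual confidence sequences to welfare bounds can be illustrated with a short argument. The notation below is assumed for illustration and is not taken from the paper: $\mu \in \mathbb{R}^n$ is the vector of expected utilities, $\mathcal{A}$ is the set of feasible allocations, $W(a, \cdot)$ is the welfare of allocation $a$ (nondecreasing in each coordinate), and $\ell_t, u_t$ are lower and upper confidence sequences for $\mu$.

```latex
% Assumed: with probability at least 1 - \delta, simultaneously for all t >= 1,
%   \ell_t \le \mu \le u_t   (coordinatewise).
% On that event, monotonicity of W(a, \cdot) gives, for every a \in \mathcal{A} and every t,
\[
  W(a, \ell_t) \;\le\; W(a, \mu) \;\le\; W(a, u_t).
\]
% Taking the maximum over allocations preserves both inequalities:
\[
  \max_{a \in \mathcal{A}} W(a, \ell_t)
  \;\le\; \max_{a \in \mathcal{A}} W(a, \mu)
  \;\le\; \max_{a \in \mathcal{A}} W(a, u_t)
  \qquad \text{for all } t \ge 1 .
\]
```

Because the event holds for all $t$ at once, the resulting interval on the optimal welfare is anytime-valid at the same level $1 - \delta$. Under this sketch, concavity and Lipschitz continuity are not needed for validity, only monotonicity.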
Abstract
In many real-world settings, a centralized decision-maker must repeatedly allocate finite resources to a population over multiple time steps. Individuals who receive a resource derive some stochastic utility; to characterize the population-level effects of an allocation, the expected individual utilities are then aggregated using a social welfare function (SWF). We formalize this setting and present a general confidence sequence framework for SWF-based online learning and inference, valid for any monotonic, concave, and Lipschitz-continuous SWF. Our key insight is that monotonicity alone suffices to lift confidence sequences from individual utilities to anytime-valid bounds on optimal welfare. Building on this foundation, we propose SWF-UCB, a SWF-agnostic online learning algorithm that achieves near-optimal regret (for resources distributed among …
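The setting described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' SWF-UCB: utilities are assumed Bernoulli in [0, 1], non-recipients are assumed to get utility 0, allocations are restricted to deterministic k-subsets searched by brute force, and a simple Hoeffding-style radius stands in for the paper's confidence sequences. All function names are hypothetical.

```python
import itertools
import math
import random


def power_mean_welfare(utilities, p=0.5):
    """Example SWF (assumed): power mean, monotone and concave for 0 < p <= 1."""
    n = len(utilities)
    return (sum(u ** p for u in utilities) / n) ** (1.0 / p)


def ucb_allocate(means, counts, t, k, welfare):
    """Return the k-subset maximizing welfare of optimistic utility estimates."""
    n = len(means)
    ucb = []
    for i in range(n):
        if counts[i] == 0:
            ucb.append(1.0)  # unobserved individuals get the maximal utility
        else:
            # Hoeffding-style radius (stand-in, not an anytime-valid sequence)
            radius = math.sqrt(2.0 * math.log(max(t, 2)) / counts[i])
            ucb.append(min(1.0, means[i] + radius))
    best, best_val = None, -1.0
    # Brute force over subsets: only practical for small n
    for subset in itertools.combinations(range(n), k):
        chosen = set(subset)
        val = welfare([ucb[i] if i in chosen else 0.0 for i in range(n)])
        if val > best_val:
            best, best_val = subset, val
    return best


def simulate(true_means, k, horizon, welfare, seed=0):
    """Run the allocate-observe-update loop with Bernoulli utilities."""
    rng = random.Random(seed)
    n = len(true_means)
    means = [0.0] * n
    counts = [0] * n
    for t in range(1, horizon + 1):
        subset = ucb_allocate(means, counts, t, k, welfare)
        for i in subset:
            reward = 1.0 if rng.random() < true_means[i] else 0.0
            counts[i] += 1
            means[i] += (reward - means[i]) / counts[i]  # running average
    return means, counts


if __name__ == "__main__":
    est, pulls = simulate([0.9, 0.2, 0.6, 0.4], k=2, horizon=500,
                          welfare=power_mean_welfare)
    print("estimated utilities:", [round(m, 2) for m in est])
    print("times allocated:", pulls)
```

The welfare function is passed in as a callable so that the allocation rule never inspects its form, which mirrors the SWF-agnostic idea in spirit. Any other monotone SWF can be substituted for `power_mean_welfare` without changing the loop.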
Taxonomy
Topics: Advanced Bandit Algorithms Research · Reinforcement Learning in Robotics · Age of Information Optimization
