Stochastic Bandits for Egalitarian Assignment
Eugene Lim, Vincent Y. F. Tan, Harold Soh

TL;DR
This paper introduces EgalMAB, a stochastic multi-armed bandit problem focused on fair user-arm assignments, proposing a UCB-based solution and analyzing its regret bounds alongside an impossibility result.
Contribution
It formulates the EgalMAB problem, develops the EgalUCB policy, and provides theoretical regret bounds and impossibility results for fair assignment in stochastic bandits.
Findings
EgalUCB achieves sublinear regret bounds.
A policy-independent impossibility result almost matches these upper bounds.
The problem has applications in fair job and resource allocation.
Abstract
We study EgalMAB, an egalitarian assignment problem in the context of stochastic multi-armed bandits. In EgalMAB, an agent is tasked with assigning a set of users to arms. At each time step, the agent must assign exactly one arm to each user such that no two users are assigned to the same arm. Subsequently, each user obtains a reward drawn from the unknown reward distribution associated with its assigned arm. The agent's objective is to maximize the minimum expected cumulative reward among all users over a fixed horizon. This problem has applications in areas such as fairness in job and resource allocations, among others. We design and analyze a UCB-based policy EgalUCB and establish upper bounds on the cumulative regret. In complement, we establish an almost-matching policy-independent impossibility result.
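The assignment setting described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's EgalUCB policy, whose details are not given on this page. It assumes Bernoulli rewards, the standard UCB1 index, and a hypothetical greedy rule (`assign_arms`) that hands the most promising arm to the user with the lowest cumulative reward so far. All function and parameter names are illustrative.

```python
import math
import random


def ucb_index(mean, pulls, t):
    """Standard UCB1 index; unpulled arms get +inf so they are tried first."""
    if pulls == 0:
        return math.inf
    return mean + math.sqrt(2.0 * math.log(t) / pulls)


def assign_arms(indices, user_totals):
    """Hypothetical egalitarian rule (an assumption, not taken from the paper).

    Select as many arms as there are users, choosing those with the highest
    index, then pair the best selected arm with the user who has accumulated
    the least reward. Each user receives a distinct arm.
    """
    n_users = len(user_totals)
    top_arms = sorted(range(len(indices)), key=lambda a: indices[a], reverse=True)[:n_users]
    users_poorest_first = sorted(range(n_users), key=lambda u: user_totals[u])
    arm_of_user = [None] * n_users
    for user, arm in zip(users_poorest_first, top_arms):
        arm_of_user[user] = arm
    return arm_of_user


def simulate(true_means, n_users, horizon, seed=0):
    """Run the sketch for `horizon` steps; requires n_users <= number of arms."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    pulls = [0] * n_arms          # times each arm has been assigned
    means = [0.0] * n_arms        # empirical mean reward per arm
    user_totals = [0.0] * n_users  # cumulative reward per user
    for t in range(1, horizon + 1):
        indices = [ucb_index(means[a], pulls[a], t) for a in range(n_arms)]
        arm_of_user = assign_arms(indices, user_totals)
        for user, arm in enumerate(arm_of_user):
            # Assumed Bernoulli reward with the arm's (unknown to the agent) mean.
            reward = 1.0 if rng.random() < true_means[arm] else 0.0
            pulls[arm] += 1
            means[arm] += (reward - means[arm]) / pulls[arm]  # running average
            user_totals[user] += reward
    return user_totals, pulls


if __name__ == "__main__":
    totals, pulls = simulate([0.9, 0.8, 0.2, 0.1], n_users=2, horizon=1000)
    print("per-user cumulative reward:", totals)
    print("minimum over users:", min(totals))
    print("pulls per arm:", pulls)
```

The quantity of interest is `min(totals)`, an empirical counterpart of the minimum expected cumulative reward among users that the agent aims to maximize. Rotating the better arms toward the user who is currently worst off is one simple way to keep that minimum from lagging; the paper's own policy and its regret analysis may differ.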
Taxonomy
Topics: Auction Theory and Applications · Experimental Behavioral Economics Studies · Advanced Bandit Algorithms Research
Methods: Sparse Evolutionary Training
