Stochastic Bandits for Egalitarian Assignment

Eugene Lim; Vincent Y. F. Tan; Harold Soh

arXiv:2410.05856 · stat.ML · October 10, 2024

TL;DR

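This paper introduces EgalMAB, a stochastic multi-armed bandit problem in which an agent assigns each user a distinct arm so as to maximize the minimum expected cumulative reward across users. It proposes a UCB-based policy, EgalUCB, and establishes upper bounds on its cumulative regret together with an almost-matching impossibility result.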

Contribution

The paper formulates the EgalMAB problem, develops the EgalUCB policy, and proves upper bounds on the cumulative regret along with an almost-matching, policy-independent impossibility result.

Findings

01. EgalUCB achieves sublinear cumulative regret.

02. A policy-independent impossibility result almost matches the regret upper bounds, limiting what any assignment policy can achieve.

03. The approach applies to fairness in job and resource allocation.

Abstract

We study EgalMAB, an egalitarian assignment problem in the context of stochastic multi-armed bandits. In EgalMAB, an agent is tasked with assigning a set of users to arms. At each time step, the agent must assign exactly one arm to each user such that no two users are assigned to the same arm. Subsequently, each user obtains a reward drawn from the unknown reward distribution associated with its assigned arm. The agent's objective is to maximize the minimum expected cumulative reward among all users over a fixed horizon. This problem has applications in areas such as fairness in job and resource allocations, among others. We design and analyze a UCB-based policy EgalUCB and establish upper bounds on the cumulative regret. In complement, we establish an almost-matching policy-independent impossibility result.
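The setting described in the abstract can be made concrete with a small simulation. The sketch below is illustrative only and is not the paper's EgalUCB policy: it assumes Bernoulli rewards, a standard UCB1 index, and a hypothetical matching rule that gives the highest-index arm to the user with the lowest cumulative reward so far. The names `ucb_index`, `assign`, and `simulate` are invented for this example.

```python
import math
import random


def ucb_index(mean, pulls, t):
    """Standard UCB1 index; unpulled arms get +inf so they are tried first."""
    if pulls == 0:
        return float("inf")
    return mean + math.sqrt(2.0 * math.log(t) / pulls)


def assign(indices, user_totals):
    """Hypothetical matching rule (an assumption, not taken from the paper):
    pick the top-U arms by index and give the highest-index arm to the user
    with the lowest cumulative reward so far. No two users share an arm."""
    num_users = len(user_totals)
    if num_users > len(indices):
        raise ValueError("need at least as many arms as users")
    top_arms = sorted(range(len(indices)), key=lambda a: indices[a], reverse=True)[:num_users]
    users_poorest_first = sorted(range(num_users), key=lambda u: user_totals[u])
    assignment = [None] * num_users
    for user, arm in zip(users_poorest_first, top_arms):
        assignment[user] = arm
    return assignment


def simulate(arm_means, num_users, horizon, seed=0):
    """Run the sketch on Bernoulli arms (reward model assumed for illustration)."""
    rng = random.Random(seed)
    num_arms = len(arm_means)
    pulls = [0] * num_arms          # how often each arm was assigned
    means = [0.0] * num_arms        # empirical mean reward per arm
    user_totals = [0.0] * num_users  # cumulative reward per user
    for t in range(1, horizon + 1):
        indices = [ucb_index(means[a], pulls[a], t) for a in range(num_arms)]
        assignment = assign(indices, user_totals)
        for user, arm in enumerate(assignment):
            reward = 1.0 if rng.random() < arm_means[arm] else 0.0
            pulls[arm] += 1
            means[arm] += (reward - means[arm]) / pulls[arm]  # running average
            user_totals[user] += reward
    return user_totals, pulls


if __name__ == "__main__":
    totals, pulls = simulate([0.9, 0.8, 0.2, 0.1], num_users=2, horizon=1000)
    # The egalitarian objective looks at the worst-off user.
    print("per-user totals:", totals, "minimum:", min(totals))
```

The quantity of interest is `min(totals)`, the realized cumulative reward of the worst-off user, which is the empirical counterpart of the minimum expected cumulative reward the abstract describes.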

Taxonomy

Topics: Auction Theory and Applications · Experimental Behavioral Economics Studies · Advanced Bandit Algorithms Research

Methods: Sparse Evolutionary Training