Multi-agent Multi-armed Bandits with Minimum Reward Guarantee Fairness

Piyushi Manupriya; Himanshu; SakethaNath Jagarlapudi; Ganesh Ghalme

arXiv:2502.15240 · cs.LG · June 23, 2025

TL;DR

This paper introduces RewardFairUCB, an algorithm for multi-agent multi-armed bandits that balances maximizing social welfare with ensuring a minimum reward guarantee for fairness, achieving sublinear regret bounds.

Contribution

The paper proposes RewardFairUCB, a UCB-based algorithm for multi-agent bandit settings that maximizes social welfare subject to a minimum reward guarantee for each agent, with proven sublinear regret bounds for both fairness and social welfare.
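To make the "UCB-based" part concrete, here is a minimal sketch of optimistic reward estimates for every (agent, arm) pair. It uses the textbook UCB1 confidence radius and assumes rewards lie in [0, 1]; the function name `ucb_matrix` and its arguments are hypothetical, and the confidence radius used by RewardFairUCB itself may differ.

```python
import math


def ucb_matrix(reward_sums, pull_counts, t):
    """Return optimistic reward estimates for each (agent, arm) pair.

    reward_sums[i][j]: total reward agent i has observed from arm j.
    pull_counts[j]:    number of times arm j has been pulled so far.
    t:                 current round, t >= 1.

    Assumptions (not taken from the paper): rewards lie in [0, 1] and the
    radius is the textbook UCB1 term sqrt(2 * ln(t) / n).
    """
    estimates = []
    for row in reward_sums:
        est_row = []
        for j, total in enumerate(row):
            n = pull_counts[j]
            if n == 0:
                # Unexplored arm: use the most optimistic value in [0, 1].
                est_row.append(1.0)
            else:
                bonus = math.sqrt(2.0 * math.log(t) / n)
                # Clip so the estimate stays a valid reward in [0, 1].
                est_row.append(min(1.0, total / n + bonus))
        estimates.append(est_row)
    return estimates
```

A policy would then be chosen against these optimistic estimates rather than the raw empirical means, which is what drives exploration of rarely pulled arms.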

Findings

01

RewardFairUCB achieves $\tilde{O}(T^{1/2})$ social welfare regret.

02

RewardFairUCB attains $\tilde{O}(T^{3/4})$ fairness regret.

03

Lower bounds of $\Omega(\sqrt{T})$ for both regrets.

Abstract

We investigate the problem of maximizing social welfare while ensuring fairness in a multi-agent multi-armed bandit (MA-MAB) setting. In this problem, a centralized decision-maker takes actions over time, generating random rewards for various agents. Our goal is to maximize the sum of expected cumulative rewards, a.k.a. social welfare, while ensuring that each agent receives an expected reward that is at least a constant fraction of the maximum possible expected reward. Our proposed algorithm, RewardFairUCB, leverages the Upper Confidence Bound (UCB) technique to achieve sublinear regret bounds for both fairness and social welfare. The fairness regret measures the positive difference between the minimum reward guarantee and the expected reward of a given policy, whereas the social welfare regret measures the difference between the social welfare of the optimal fair policy and that of…
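The quantities named in the abstract can be illustrated with a short sketch for a known reward matrix. Everything here is an assumption made for illustration: `A[i][j]` is the expected reward of agent i from arm j, `pi` is a probability distribution over arms, `C[i]` is the guaranteed fraction for agent i, an agent's maximum possible expected reward is taken to be its best single arm, and shortfalls are summed across agents. The paper's formal definitions may differ in these details.

```python
def expected_rewards(A, pi):
    """Expected reward of each agent when arms are drawn from policy pi."""
    return [sum(a * p for a, p in zip(row, pi)) for row in A]


def social_welfare(A, pi):
    """Sum of the agents' expected rewards under policy pi."""
    return sum(expected_rewards(A, pi))


def fairness_violation(A, C, pi):
    """Total positive shortfall below each agent's minimum reward guarantee.

    Agent i is guaranteed C[i] * max_j A[i][j] (assumed reading of the
    'constant fraction of the maximum possible expected reward').
    """
    total = 0.0
    for row, c, reward in zip(A, C, expected_rewards(A, pi)):
        guarantee = c * max(row)
        total += max(0.0, guarantee - reward)
    return total


# Hypothetical two-agent, two-arm instance.
A = [[1.0, 0.0],
     [0.0, 1.0]]
C = [0.25, 0.25]

unfair_policy = [1.0, 0.0]   # always pull arm 0, so agent 1 gets nothing
mixed_policy = [0.5, 0.5]    # split pulls evenly between the two arms
```

In this instance both policies have the same social welfare, but only `mixed_policy` meets every agent's guarantee; `unfair_policy` leaves agent 1 short by its full guarantee of 0.25.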

Code & Models

Repositories

Piyushi-0/Fair-MAMAB
Official