Bandit Max-Min Fair Allocation
Tsubasa Harada, Shinji Ito, Hanna Sumita

TL;DR
This paper introduces the bandit max-min fair allocation problem, in which indivisible goods are repeatedly allocated to agents and item values are observed only through semi-bandit feedback. It proposes an algorithm with a proven regret upper bound and establishes a regret lower bound.
Contribution
The paper presents an algorithm for the bandit max-min fair allocation (BMMFA) problem with a proven regret upper bound and establishes a regret lower bound, clarifying what is achievable for fair allocation under semi-bandit feedback.
Findings
Proposed an algorithm with $O\left(\frac{m\sqrt{T}\ln T}{n} + m\sqrt{T\ln(mnT)}\right)$ regret.
Established a regret lower bound of $\Omega\left(\frac{m\sqrt{T}}{n}\right)$.
Showed that the upper and lower bounds differ by only a logarithmic factor when $T$ is large.
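As a rough check of the logarithmic gap, one can divide the upper bound by the lower bound, treating both as exact expressions and ignoring constant factors (a simplification made here for illustration, not a statement from the paper):

$$
\frac{\dfrac{m\sqrt{T}\ln T}{n} + m\sqrt{T\ln(mnT)}}{\dfrac{m\sqrt{T}}{n}} = \ln T + n\sqrt{\ln(mnT)}.
$$

Whenever $n\sqrt{\ln(mnT)} \le \ln T$, which requires $T$ to be large relative to $n$, the ratio is at most $2\ln T$, so the two bounds then differ by an $O(\ln T)$ factor.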
Abstract
In this paper, we study a new decision-making problem called the bandit max-min fair allocation (BMMFA) problem. The goal of this problem is to maximize the minimum utility among agents with additive valuations by repeatedly assigning indivisible goods to them. One key feature of this problem is that each agent's valuation for each item can only be observed through semi-bandit feedback, while existing work supposes that the item values are provided at the beginning of each round. Another key feature is that the algorithm's reward function is not additive with respect to rounds, unlike most bandit-setting problems. Our first contribution is to propose an algorithm that has an asymptotic regret bound of $O\left(\frac{m\sqrt{T}\ln T}{n} + m\sqrt{T\ln(mnT)}\right)$, where $n$ is the number of agents, $m$ is the number of items, and $T$ is the time horizon. This is based on a novel combination of bandit…
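To make the two key features concrete, here is a minimal Python sketch. It assumes the objective is the minimum over agents of their cumulative utility across rounds, and that semi-bandit feedback reveals only the value of each item to the agent who received it. The function name `run_rounds` and the data layout are hypothetical and chosen for illustration; this is not the paper's algorithm, only an evaluation of fixed allocations.

```python
def run_rounds(values_per_round, allocations):
    """Evaluate fixed allocations (illustrative helper, not the paper's algorithm).

    values_per_round[t][i][e]: value of item e to agent i in round t.
    allocations[t][e]: index of the agent who receives item e in round t.
    """
    n = len(values_per_round[0])
    cumulative = [0.0] * n          # additive utility of each agent over all rounds
    sum_of_round_minima = 0.0       # what a round-additive reward would measure
    feedback = []                   # semi-bandit observations, one dict per round

    for values, alloc in zip(values_per_round, allocations):
        round_utility = [0.0] * n
        observed = {}
        for item, agent in enumerate(alloc):
            v = values[agent][item]
            round_utility[agent] += v
            # Only the (receiving agent, item) value is revealed.
            observed[(agent, item)] = v
        for i in range(n):
            cumulative[i] += round_utility[i]
        sum_of_round_minima += min(round_utility)
        feedback.append(observed)

    # Assumed max-min objective: minimum cumulative utility among agents.
    return min(cumulative), sum_of_round_minima, feedback


# Two agents, one item worth 1.0 to both, two rounds; the item alternates.
values_per_round = [[[1.0], [1.0]], [[1.0], [1.0]]]
allocations = [[0], [1]]
egalitarian, per_round_sum, feedback = run_rounds(values_per_round, allocations)
print(egalitarian, per_round_sum)
```

In this toy instance each agent ends with cumulative utility 1.0, so the max-min objective is 1.0, while the per-round minima sum to 0.0 because one agent is empty-handed in every round. This is one way to see why the reward does not decompose additively over rounds.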
Taxonomy
Topics: Optimization and Search Problems · Advanced Bandit Algorithms Research · Auction Theory and Applications
