Improved Regret Bounds for Online Fair Division with Bandit Learning

Benjamin Schiffer; Shirley Zhang

arXiv:2501.07022·cs.GT·January 14, 2025

TL;DR

This paper introduces a UCB algorithm for online fair division with bandit learning, achieving near-optimal regret bounds while satisfying proportionality constraints in expectation.

Contribution

It presents the first UCB-based method for online fair division with unknown values, improving regret bounds from O(T^{2/3}) to O(\sqrt{T}).

Findings

01

Achieves \tilde{O}(\sqrt{T}) regret with high probability.

02

Guarantees proportionality in expectation under unknown value distributions.

03

Introduces a two-round linear optimization UCB algorithm for this setting.

Abstract

We study online fair division when there are a finite number of item types and the player values for the items are drawn randomly from distributions with unknown means. In this setting, a sequence of indivisible items arrives according to a random online process, and each item must be allocated to a single player. The goal is to maximize expected social welfare while maintaining that the allocation satisfies proportionality in expectation. When player values are normalized, we show that it is possible, with high probability, to guarantee proportionality constraint satisfaction and achieve $\tilde{O}(\sqrt{T})$ regret. To achieve this result, we present an upper confidence bound (UCB) algorithm that uses two rounds of linear optimization. This algorithm highlights fundamental aspects of proportionality constraints that allow for a UCB algorithm despite the presence of many (potentially…
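To make the two ingredients named in the abstract concrete, the following is a minimal sketch, not the paper's algorithm. It assumes values normalized to [0, 1], a Hoeffding-style confidence radius, and a fractional allocation matrix over item types; the function names, the radius constant, and the example numbers are all hypothetical. The paper's two rounds of linear optimization are not reproduced here; the sketch only shows how an optimistic estimate can be formed and how proportionality in expectation can be checked for a given allocation.

```python
import math


def ucb_estimate(total, count, t, delta=0.05):
    """Optimistic estimate of a mean value in [0, 1].

    Assumption: a Hoeffding-style radius; the paper's exact bonus may differ.
    """
    if count == 0:
        return 1.0  # no observations yet: be maximally optimistic
    mean = total / count
    radius = math.sqrt(math.log(2 * t / delta) / (2 * count))
    return min(1.0, mean + radius)


def expected_values(mu, p, x):
    """Expected per-item value of each player under fractional allocation x.

    mu[i][k]: mean value of player i for item type k
    p[k]:     probability that an arriving item has type k
    x[i][k]:  fraction of type-k items given to player i (columns sum to 1)
    """
    return [sum(p[k] * x[i][k] * mu[i][k] for k in range(len(p)))
            for i in range(len(mu))]


def is_proportional(mu, p, x, tol=1e-12):
    """Check that every player gets at least 1/n of their expected total value."""
    n = len(mu)
    values = expected_values(mu, p, x)
    shares = [sum(p[k] * mu[i][k] for k in range(len(p))) / n
              for i in range(n)]
    return all(v >= s - tol for v, s in zip(values, shares))


# Hypothetical instance: two players, two item types.
mu = [[0.8, 0.2], [0.3, 0.6]]
p = [0.5, 0.5]
welfare_maximizing = [[1.0, 0.0], [0.0, 1.0]]  # each type to whoever values it most
print(is_proportional(mu, p, welfare_maximizing))
print(sum(expected_values(mu, p, welfare_maximizing)))
```

In a full algorithm, the unknown means `mu` would be replaced by estimates such as `ucb_estimate`, and the allocation would be chosen by optimizing welfare subject to the proportionality check rather than fixed by hand.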
