Online Generalized-mean Welfare Maximization: Achieving Near-Optimal Regret from Samples
Zongjun Yang, Rachitesh Kumar, Christian Kroer

TL;DR
This paper introduces online algorithms for fair allocation that maximize generalized-mean welfare and achieve near-optimal regret without prior knowledge of the arrival distributions: online samples alone suffice in the i.i.d. setting, and a single historical sample suffices in the non-stationary setting.
Contribution
It presents a distribution-free, sample-efficient online allocation method that attains optimal regret rates in both i.i.d. and non-stationary models.
Findings
The pure greedy algorithm achieves $\widetilde{O}(1/T)$ average regret in the i.i.d. setting.
A single historical sample suffices for near-optimal regret in the non-stationary model.
The algorithms are robust to distribution shifts affecting the historical data.
Abstract
We study online fair allocation of sequentially arriving items among agents with heterogeneous preferences, with the objective of maximizing generalized-mean welfare, defined as the $p$-mean of agents' time-averaged utilities, with $p \in (-\infty, 1)$. We first consider the i.i.d. arrival model and show that the pure greedy algorithm (which myopically chooses the welfare-maximizing integral allocation) achieves $\widetilde{O}(1/T)$ average regret. Importantly, in contrast to prior work, our algorithm does not require distributional knowledge and achieves the optimal regret rate using only the online samples. We then go beyond i.i.d. arrivals and investigate a nonstationary model with time-varying independent distributions. In the absence of additional data about the distributions, it is known that every online algorithm must suffer $\Omega(1)$ average regret. We show that…
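To make the greedy rule described in the abstract concrete, here is a minimal Python sketch, not the authors' implementation. It assumes the standard generalized (power) mean, $M_p(u) = (\frac{1}{n}\sum_i u_i^p)^{1/p}$, with the geometric mean as the $p = 0$ case, and it gives each arriving item whole to the agent whose receipt maximizes the resulting $p$-mean of cumulative utilities. The names `p_mean` and `greedy_allocate`, the small smoothing constant `eps` (which keeps utilities positive so the mean is defined for $p \le 0$), and the lowest-index tie-breaking are illustrative assumptions, not details taken from the paper.

```python
import math


def p_mean(utilities, p):
    """Generalized (power) mean of strictly positive utilities.

    p = 0 is treated as the geometric mean (the limit of the power mean).
    """
    n = len(utilities)
    if p == 0:
        return math.exp(sum(math.log(u) for u in utilities) / n)
    return (sum(u ** p for u in utilities) / n) ** (1.0 / p)


def greedy_allocate(values, p, eps=1e-9):
    """Myopic integral allocation: each item goes to one agent.

    values[t][i] is agent i's value for the item arriving at step t.
    eps is a hypothetical smoothing term (an assumption of this sketch)
    so that utilities start strictly positive.
    Returns (chosen agent per step, time-averaged utilities).
    """
    n = len(values[0])
    totals = [eps] * n
    choices = []
    for v in values:
        best_agent, best_welfare = 0, -math.inf
        for i in range(n):
            # Welfare if the whole item were given to agent i.
            trial = totals.copy()
            trial[i] += v[i]
            w = p_mean(trial, p)
            if w > best_welfare:  # strict '>' breaks ties toward lower index
                best_agent, best_welfare = i, w
        totals[best_agent] += v[best_agent]
        choices.append(best_agent)
    horizon = len(values)
    return choices, [u / horizon for u in totals]
```

Because the $p$-mean is homogeneous of degree one, comparing cumulative utilities picks the same agent as comparing time-averaged utilities. With `p = 1` the rule reduces to giving each item to its highest-valuing agent, while smaller `p` pushes the allocation toward agents who have received less so far.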
Taxonomy
Topics: Advanced Bandit Algorithms Research · Optimization and Search Problems · Game Theory and Applications
