Fair Algorithms with Probing for Multi-Agent Multi-Armed Bandits
Tianyi Xu, Jiaxin Liu, Nicholas Mattei, Zizhan Zheng

TL;DR
This paper introduces a novel framework for fair multi-agent multi-armed bandits that uses strategic probing to improve fairness and efficiency, with proven theoretical guarantees and strong empirical results.
Contribution
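The paper presents a probing-based approach to fairness in multi-agent multi-armed bandits (MA-MABs): a greedy offline probing algorithm with a provable performance bound and an online algorithm with sublinear regret, both evaluated on synthetic and real-world data.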
Findings
Outperforms baseline methods in fairness and efficiency
Achieves sublinear regret in online settings
Provides provable guarantees in offline scenarios
Abstract
We propose a multi-agent multi-armed bandit (MA-MAB) framework aimed at ensuring fair outcomes across agents while maximizing overall system performance. A key challenge in this setting is decision-making under limited information about arm rewards. To address this, we introduce a novel probing framework that strategically gathers information about selected arms before allocation. In the offline setting, where reward distributions are known, we leverage submodular properties to design a greedy probing algorithm with a provable performance bound. For the more complex online setting, we develop an algorithm that achieves sublinear regret while maintaining fairness. Extensive experiments on synthetic and real-world datasets show that our approach outperforms baseline methods, achieving better fairness and efficiency.
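The offline idea described in the abstract, greedily choosing which arms to probe by exploiting submodularity, can be illustrated with a minimal sketch. This is not the authors' algorithm: the function `greedy_select`, the probing budget, and the toy `coverage` objective are all hypothetical stand-ins, chosen only because weighted coverage is a standard monotone submodular function. For such functions under a cardinality constraint, the classical greedy rule is known to achieve a (1 - 1/e) approximation; the paper's own objective and bound may differ.

```python
from typing import Callable, FrozenSet, Iterable, List

def greedy_select(
    candidates: Iterable[str],
    budget: int,
    value: Callable[[FrozenSet[str]], float],
) -> List[str]:
    """Pick up to `budget` arms, each time adding the arm with the
    largest marginal gain in `value` (hypothetical set objective)."""
    chosen: List[str] = []
    remaining = list(candidates)
    current = value(frozenset())
    for _ in range(budget):
        best, best_gain = None, 0.0
        for arm in remaining:
            gain = value(frozenset(chosen + [arm])) - current
            if gain > best_gain:
                best, best_gain = arm, gain
        if best is None:  # no arm adds positive value; stop early
            break
        chosen.append(best)
        remaining.remove(best)
        current += best_gain
    return chosen

# Toy stand-in for "information gained by probing": each arm covers
# a set of items, and the value of a probe set is how many distinct
# items it covers. This is an assumption for illustration only.
COVERS = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}, "d": {1, 2}}

def coverage(probe_set: FrozenSet[str]) -> float:
    covered = set()
    for arm in probe_set:
        covered |= COVERS[arm]
    return float(len(covered))

picked = greedy_select(COVERS, budget=2, value=coverage)
print(picked, coverage(frozenset(picked)))
```

With a budget of two, the sketch first picks the arm covering the most items and then the arm that adds the most new items given that choice, which is the diminishing-returns behaviour that submodularity formalizes.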
Taxonomy
Topics: Advanced Bandit Algorithms Research · Auction Theory and Applications · Data Stream Mining Techniques
