Competing Bandits in Matching Markets via Super Stability
Soumya Basu

TL;DR
This paper studies bandit learning in two-sided matching markets through the lens of super-stability. It gives a centralized algorithm with logarithmic stable regret, a decentralized adaptation, and an instance-dependent lower bound for learning stable matchings under incomplete information.
Contribution
It applies the Extended Gale-Shapley (GS) algorithm, which is built on super-stability, to matching markets with two-sided reward uncertainty, and provides new regret bounds together with a lower bound for binary stable regret.
Findings
The centralized algorithm built on Extended GS achieves logarithmic pessimal stable regret that depends on an instance-dependent admissible gap.
Decentralized adaptation incurs only a constant regret increase.
A new centralized, instance-dependent lower bound for binary stable regret characterizes the complexity of stable matching with bandit feedback.
Abstract
We study bandit learning in matching markets with two-sided reward uncertainty, extending prior research primarily focused on single-sided uncertainty. Leveraging the concept of 'super-stability' from Irving (1994), we demonstrate the advantage of the Extended Gale-Shapley (GS) algorithm over the standard GS algorithm in achieving true stable matchings under incomplete information. By employing the Extended GS algorithm, our centralized algorithm attains a logarithmic pessimal stable regret dependent on an instance-dependent admissible gap parameter. This algorithm is further adapted to a decentralized setting with a constant regret increase. Finally, we establish a novel centralized instance-dependent lower bound for binary stable regret, elucidating the roles of the admissible gap and super-stable matching in characterizing the complexity of stable matching with bandit feedback.
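As an illustration of the super-stability notion the abstract relies on, here is a minimal sketch that is not taken from the paper. It assumes each side's preferences are known only through confidence intervals, that one option is strictly preferred to another only when their intervals do not overlap, and that the market has equally many players and arms, all matched. Under Irving's definition, a matching is super-stable when no unmatched pair exists in which neither side strictly prefers its current partner. The names `strictly_prefers`, `is_super_stable`, and the `p_lcb`/`p_ucb`/`a_lcb`/`a_ucb` arrays are hypothetical.

```python
from itertools import product


def strictly_prefers(lcb, ucb, x, y):
    """True when option x beats option y with non-overlapping intervals."""
    return lcb[x] > ucb[y]


def is_super_stable(matching, p_lcb, p_ucb, a_lcb, a_ucb):
    """Check super-stability of a complete one-to-one matching.

    matching[i] is the arm matched to player i (assumed a permutation).
    p_lcb[i][j], p_ucb[i][j]: player i's confidence bounds on arm j.
    a_lcb[j][i], a_ucb[j][i]: arm j's confidence bounds on player i.
    """
    n = len(matching)
    partner_of_arm = {arm: player for player, arm in enumerate(matching)}
    for i, j in product(range(n), range(n)):
        if matching[i] == j:
            continue
        # The pair (i, j) is harmless only if at least one side is
        # certain that its current partner is better.
        player_settled = strictly_prefers(p_lcb[i], p_ucb[i], matching[i], j)
        arm_settled = strictly_prefers(a_lcb[j], a_ucb[j], partner_of_arm[j], i)
        if not (player_settled or arm_settled):
            return False
    return True


# Tight intervals: player 0 and arm 0 clearly prefer each other, same for 1.
tight_lcb = [[0.8, 0.1], [0.1, 0.8]]
tight_ucb = [[0.9, 0.2], [0.2, 0.9]]
# Wide intervals: nothing can be strictly compared yet.
wide_lcb = [[0.0, 0.0], [0.0, 0.0]]
wide_ucb = [[1.0, 1.0], [1.0, 1.0]]

certified = is_super_stable([0, 1], tight_lcb, tight_ucb, tight_lcb, tight_ucb)
swapped = is_super_stable([1, 0], tight_lcb, tight_ucb, tight_lcb, tight_ucb)
uncertain = is_super_stable([0, 1], wide_lcb, wide_ucb, wide_lcb, wide_ucb)
```

In this toy setup, a matching can only be certified once the intervals have separated enough. That is one informal way to read why a gap parameter shows up in the regret bound.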
Taxonomy
Topics: Advanced Bandit Algorithms Research · Auction Theory and Applications · Game Theory and Applications
