Player-optimal Stable Regret for Bandit Learning in Matching Markets

Fang Kong; Shuai Li

arXiv:2307.10890·cs.LG·July 21, 2023

TL;DR

This paper introduces an algorithm for bandit learning in matching markets that achieves near-optimal stable regret bounds for players, significantly improving over previous methods especially when preference gaps are small.

Contribution

The paper proposes the explore-then-Gale-Shapley (ETGS) algorithm and shows that each player's player-optimal stable regret is bounded by O(K log T / Δ^2), where K is the number of arms, T is the time horizon, and Δ is the players' minimum preference gap.
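As background for the "Gale-Shapley" part of the algorithm's name, the following is a minimal sketch of the classic player-proposing deferred-acceptance procedure, which returns the player-optimal stable matching when all preferences are already known. It is not the paper's ETGS algorithm and contains no exploration phase; the function name `player_proposing_gale_shapley` and the toy preference lists are illustrative assumptions.

```python
from collections import deque


def player_proposing_gale_shapley(player_prefs, arm_prefs):
    """Offline deferred acceptance with players proposing.

    player_prefs[p] lists arms from most to least preferred;
    arm_prefs[a] lists players from most to least preferred.
    Returns a dict mapping each matched player to its arm.
    All names here are hypothetical, chosen for illustration.
    """
    # arm_rank[a][p] is the position of player p in arm a's list (lower is better).
    arm_rank = {a: {p: r for r, p in enumerate(prefs)} for a, prefs in arm_prefs.items()}
    next_choice = {p: 0 for p in player_prefs}  # index of the next arm each player will try
    holder = {}                                 # arm -> player it currently holds
    free = deque(player_prefs)                  # players without a tentative match

    while free:
        p = free.popleft()
        if next_choice[p] >= len(player_prefs[p]):
            continue  # p has been rejected by every arm and stays unmatched
        a = player_prefs[p][next_choice[p]]
        next_choice[p] += 1
        current = holder.get(a)
        if current is None:
            holder[a] = p                       # arm was free: accept tentatively
        elif arm_rank[a][p] < arm_rank[a][current]:
            holder[a] = p                       # arm prefers the new proposer
            free.append(current)
        else:
            free.append(p)                      # proposal rejected; p tries its next arm later

    return {p: a for a, p in holder.items()}


# Toy market with two stable matchings (assumed preferences).
player_prefs = {"p1": ["a1", "a2"], "p2": ["a2", "a1"]}
arm_prefs = {"a1": ["p2", "p1"], "a2": ["p1", "p2"]}

player_optimal = player_proposing_gale_shapley(player_prefs, arm_prefs)
# Swapping roles makes the arms propose, which yields the players' least-preferred stable matching.
arm_proposing = player_proposing_gale_shapley(arm_prefs, player_prefs)
player_pessimal = {p: a for a, p in arm_proposing.items()}
```

In this toy market every player gets its first choice under `player_optimal` and its second choice under `player_pessimal`, which illustrates why regret measured against the player-optimal matching is the more demanding benchmark.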

Findings

01 Achieves a player-optimal stable regret bound that is polynomial in 1/Δ.

02 Improves on previous results, whose bounds can be exponentially large when preference gaps are small.

03 Matches the known lower bound under certain preference conditions.

Abstract

The problem of matching markets has been studied for a long time in the literature due to its wide range of applications. Finding a stable matching is a common equilibrium objective in this problem. Since market participants are usually uncertain of their preferences, a rich line of recent work studies the online setting where participants on one side (players) learn their unknown preferences from iterative interactions with the other side (arms). Most previous works in this line are only able to derive theoretical guarantees for player-pessimal stable regret, which is defined relative to the players' least-preferred stable matching. However, under the pessimal stable matching, players only obtain the least reward among all stable matchings. To maximize players' profits, the player-optimal stable matching would be the most desirable. Though Basu et al. (2021) successfully bring an upper…

Taxonomy

Topics: Advanced Bandit Algorithms Research · Auction Theory and Applications · Stochastic Gradient Optimization Techniques