Beyond $\log^2(T)$ Regret for Decentralized Bandits in Matching Markets

Soumya Basu; Karthik Abinav Sankararaman; Abishek Sankararaman

arXiv:2103.07501·cs.LG·March 16, 2021·5 cites



TL;DR

This paper introduces decentralized algorithms for regret minimization in two-sided matching markets with one-sided bandit feedback. The algorithms improve on the regret bounds of prior work, both in general markets and in markets satisfying uniqueness consistency.

Contribution

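The paper presents new decentralized algorithms that achieve $O(\log^{1+\varepsilon}(T))$ regret to the agent-optimal stable matching in general markets, for any $\varepsilon > 0$, and the optimal $\Theta(\log(T))$ regret in markets satisfying uniqueness consistency, improving on the $O(\log^{2}(T))$ bound of prior work.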

Findings

01

Achieved $O(\log^{1+\varepsilon}(T))$ regret for general markets.

02

Established $\Theta(\log(T))$ regret in markets satisfying uniqueness consistency.

03

Demonstrated algorithm superiority through simulations.

Abstract

We design decentralized algorithms for regret minimization in the two-sided matching market with one-sided bandit feedback that significantly improve upon the prior works (Liu et al. 2020a, 2020b, Sankararaman et al. 2020). First, for general markets, for any $\varepsilon > 0$, we design an algorithm that achieves $O(\log^{1+\varepsilon}(T))$ regret to the agent-optimal stable matching, with unknown time horizon $T$, improving upon the $O(\log^{2}(T))$ regret achieved in (Liu et al. 2020b). Second, we provide the optimal $\Theta(\log(T))$ agent-optimal regret for markets satisfying uniqueness consistency, i.e., markets where leaving participants don't alter the original stable matching. Previously, $\Theta(\log(T))$ regret was achievable (Sankararaman et al. 2020, Liu et al. 2020b) only in the much more restricted serial dictatorship setting, where all arms have the same preference over the agents.…
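The regret benchmark in the abstract is the agent-optimal stable matching. As a point of reference, the sketch below computes that matching when all preferences are known, using the classical agent-proposing deferred acceptance (Gale-Shapley) procedure. It is not the paper's decentralized bandit algorithm, where agents must learn their preferences from feedback. The function name, the dictionary layout, and the toy market are assumptions made for illustration, and complete preference lists are assumed.

```python
from collections import deque


def agent_optimal_stable_matching(agent_prefs, arm_prefs):
    """Agent-proposing deferred acceptance with known preferences.

    agent_prefs: dict mapping each agent to its list of arms, best first.
    arm_prefs:   dict mapping each arm to its list of agents, best first.
    Returns a dict mapping each matched agent to its arm.
    """
    # rank[arm][agent] is the agent's position in the arm's list (lower is better).
    rank = {
        arm: {agent: i for i, agent in enumerate(prefs)}
        for arm, prefs in arm_prefs.items()
    }
    next_choice = {agent: 0 for agent in agent_prefs}  # next arm index to try
    holder = {}  # arm -> agent it currently holds
    free = deque(agent_prefs)

    while free:
        agent = free.popleft()
        if next_choice[agent] >= len(agent_prefs[agent]):
            continue  # agent has been rejected by every arm; stays unmatched
        arm = agent_prefs[agent][next_choice[agent]]
        next_choice[agent] += 1
        current = holder.get(arm)
        if current is None:
            holder[arm] = agent  # arm was free, tentatively accept
        elif rank[arm][agent] < rank[arm][current]:
            holder[arm] = agent  # arm prefers the new proposer
            free.append(current)
        else:
            free.append(agent)  # rejected, will propose to its next arm

    return {agent: arm for arm, agent in holder.items()}


# Hypothetical toy market: both agents prefer b1, and b1 prefers a2.
agent_prefs = {"a1": ["b1", "b2"], "a2": ["b1", "b2"]}
arm_prefs = {"b1": ["a2", "a1"], "b2": ["a1", "a2"]}
matching = agent_optimal_stable_matching(agent_prefs, arm_prefs)
print(sorted(matching.items()))
```

In this toy market both agents want `b1`, the arm keeps `a2`, and `a1` falls back to `b2`. Regret in the abstract is measured against the outcome of this kind of matching, not against each agent's single favourite arm.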


Videos

Beyond $\log^2(T)$ Regret for Decentralized Bandits in Matching Markets · SlidesLive

Taxonomy

Topics: Advanced Bandit Algorithms Research · Auction Theory and Applications · Mobile Crowdsensing and Crowdsourcing