Learning Equilibria in Matching Markets from Bandit Feedback

Meena Jagadeesan; Alexander Wei; Yixin Wang; Michael I. Jordan; Jacob Steinhardt

arXiv:2108.08843·cs.LG·February 2, 2023

TL;DR

This paper introduces a framework and algorithms for learning stable market outcomes in large-scale, two-sided matching markets where user preferences are uncertain and must be learned from bandit feedback, with the goal of keeping outcomes close to stable while learning is still in progress.

Contribution

The paper develops an incentive-aware learning objective that captures the distance of a market outcome from equilibrium, and applies bandit algorithms to matching with transferable utilities, providing near-optimal regret bounds.

Findings

01. Bandit algorithms can effectively learn market equilibria under uncertainty.

02. The proposed approach achieves near-optimal regret bounds.

03. Stability can be approximated in data-driven matching markets.

Abstract

Large-scale, two-sided matching platforms must find market outcomes that align with user preferences while simultaneously learning these preferences from data. Classical notions of stability (Gale and Shapley, 1962; Shapley and Shubik, 1971) are unfortunately of limited value in the learning setting, given that preferences are inherently uncertain and destabilizing while they are being learned. To bridge this gap, we develop a framework and algorithms for learning stable market outcomes under uncertainty. Our primary setting is matching with transferable utilities, where the platform both matches agents and sets monetary transfers between them. We design an incentive-aware learning objective that captures the distance of a market outcome from equilibrium. Using this objective, we analyze the complexity of learning as a function of preference structure, casting learning as a stochastic…
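To make the notion of stability with transfers concrete, the sketch below checks the classical Shapley-Shubik conditions for a given matching and set of transfers when preferences are fully known. It is an illustration only, not the paper's learning objective or algorithm. The function names, the customer/provider framing, and the convention that each matched customer pays a single transfer to their provider are assumptions made for this example.

```python
from typing import List, Optional, Tuple


def net_payoffs(
    u: List[List[float]],
    v: List[List[float]],
    match: List[Optional[int]],
    transfer: List[float],
) -> Tuple[List[float], List[float]]:
    """Return (customer payoffs, provider payoffs) after transfers.

    u[i][j]: utility of customer i when matched to provider j (assumed known).
    v[i][j]: utility of provider j when matched to customer i (assumed known).
    match[i]: provider matched to customer i, or None if unmatched.
    transfer[i]: amount customer i pays their provider (hypothetical convention).
    Unmatched agents receive payoff 0.
    """
    cust = [0.0] * len(u)
    prov = [0.0] * len(u[0])
    for i, j in enumerate(match):
        if j is not None:
            cust[i] = u[i][j] - transfer[i]
            prov[j] = v[i][j] + transfer[i]
    return cust, prov


def max_violation(
    u: List[List[float]],
    v: List[List[float]],
    match: List[Optional[int]],
    transfer: List[float],
) -> float:
    """Largest violation of the stability conditions (0.0 means stable).

    Checks individual rationality (no agent does worse than staying
    unmatched) and the absence of blocking pairs (no customer-provider
    pair could jointly earn more than their current combined payoff).
    """
    cust, prov = net_payoffs(u, v, match, transfer)
    worst = 0.0
    for payoff in cust + prov:
        worst = max(worst, -payoff)
    for i in range(len(u)):
        for j in range(len(u[0])):
            worst = max(worst, u[i][j] + v[i][j] - cust[i] - prov[j])
    return worst


# Two customers, two providers; providers are indifferent (v is all zeros).
u = [[4.0, 1.0], [3.0, 2.0]]
v = [[0.0, 0.0], [0.0, 0.0]]
match = [0, 1]

print(max_violation(u, v, match, [0.0, 0.0]))  # 1.0: customer 1 and provider 0 block
print(max_violation(u, v, match, [2.0, 0.0]))  # 0.0: a transfer of 2 removes the block
```

In the example, the matching alone does not determine stability: the same matching is unstable with zero transfers and stable once customer 0 pays provider 0 enough to make provider 0 unwilling to defect. In the learning setting described in the abstract, the utilities are not known in advance, so a check of this kind could only be applied to estimates.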

Videos

Learning Equilibria in Matching Markets from Bandit Feedback · SlidesLive