Bandit Learning in Matching Markets with Interviews

Amirmahdi Mirfakhar; Xuchuang Wang; Mengfan Xu; Hedyeh Beyhaghi; Mohammad Hajiesmaili

arXiv:2602.12224·cs.GT·February 13, 2026

Open Access

TL;DR

This paper introduces a bandit learning framework for two-sided matching markets with interviews, allowing firms and agents to learn preferences over time despite initial uncertainty, and proposes algorithms with strong regret guarantees.

Contribution

It extends existing matching market models by incorporating firm-side uncertainty and strategic deferral, enabling decentralized learning with improved regret bounds.

Findings

1. Algorithms achieve time-independent regret in various settings.

2. Decentralized performance approaches centralized benchmarks under mild conditions.

3. Significant improvement over previous $O(\log T)$ regret bounds.

Abstract

Two-sided matching markets rely on preferences from both sides, yet it is often impractical to evaluate preferences. Participants, therefore, conduct a limited number of interviews, which provide early, noisy impressions and shape final decisions. We study bandit learning in matching markets with interviews, modeling interviews as low-cost hints that reveal partial preference information to both sides. Our framework departs from existing work by allowing firm-side uncertainty: firms, like agents, may be unsure of their own preferences and can make early hiring mistakes by hiring less preferred agents. To handle this, we extend the firm's action space to allow strategic deferral (choosing not to hire in a round), enabling recovery from suboptimal hires and supporting decentralized learning without coordination. We design novel algorithms for (i) a centralized setting with…

Taxonomy

Topics: Advanced Bandit Algorithms Research · Mobile Crowdsensing and Crowdsourcing · Game Theory and Voting Systems