Dynamic Matching Bandit For Two-Sided Online Markets
Yuantong Li, Chi-hua Wang, Guang Cheng, Will Wei Sun

TL;DR
This paper introduces a dynamic matching bandit algorithm for two-sided online markets with evolving preferences based on contextual information, providing theoretical guarantees and practical robustness in various settings.
Contribution
The paper proposes a dynamic matching framework with an online preference estimation method and proves logarithmic regret bounds, extending prior work that assumes static preferences.
Findings
The algorithm reaches the agent-optimal stable matching with high probability.
The regret upper bound is logarithmic in the time horizon T, i.e. O(log(T)).
Performance is robust across diverse preference and context scenarios.
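To make "agent-optimal stable matching" concrete, the sketch below runs agent-proposing deferred acceptance (Gale-Shapley) on fully known preference lists. It is an illustration under assumptions, not the paper's algorithm: the paper estimates preferences online from contextual information, whereas here the rankings are given, and the function name, the "arms" label for the other side of the market, and the toy preference lists are all hypothetical.

```python
from collections import deque


def agent_optimal_stable_matching(agent_prefs, arm_prefs):
    """Agent-proposing deferred acceptance on known preference lists.

    agent_prefs: dict mapping each agent to a list of arms, best first.
    arm_prefs:   dict mapping each arm to a list of agents, best first.
    Returns a dict mapping each matched agent to its arm.
    """
    # rank[arm][agent] is the agent's position in the arm's list (lower is better).
    rank = {
        arm: {agent: pos for pos, agent in enumerate(prefs)}
        for arm, prefs in arm_prefs.items()
    }
    next_choice = {agent: 0 for agent in agent_prefs}  # next arm to propose to
    holder = {}  # arm -> agent it currently holds
    free = deque(agent_prefs)

    while free:
        agent = free.popleft()
        if next_choice[agent] >= len(agent_prefs[agent]):
            continue  # agent has exhausted its list and stays unmatched
        arm = agent_prefs[agent][next_choice[agent]]
        next_choice[agent] += 1
        current = holder.get(arm)
        if current is None:
            holder[arm] = agent  # arm was free, so it holds the proposal
        elif rank[arm][agent] < rank[arm][current]:
            holder[arm] = agent  # arm trades up and releases its old agent
            free.append(current)
        else:
            free.append(agent)  # proposal rejected, agent tries its next arm

    return {agent: arm for arm, agent in holder.items()}


# Toy market (hypothetical): both agents rank b1 first, and b1 prefers a2.
agent_prefs = {"a1": ["b1", "b2"], "a2": ["b1", "b2"]}
arm_prefs = {"b1": ["a2", "a1"], "b2": ["a1", "a2"]}
matching = agent_optimal_stable_matching(agent_prefs, arm_prefs)
```

In this toy market a1 is displaced from b1 by a2 and settles on b2, so the result pairs a2 with b1 and a1 with b2. Because agents propose, the outcome is the stable matching that every agent weakly prefers to any other stable matching, which is the sense of "agent-optimal" used in the findings.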
Abstract
Two-sided online matching platforms are employed in various markets. However, agents' preferences in the current market are usually implicit and unknown, thus needing to be learned from data. With the growing availability of dynamic side information involved in the decision process, modern online matching methodology demands the capability to track shifting preferences for agents based on contextual information. This motivates us to propose a novel framework for this dynamic online matching problem with contextual information, which allows for dynamic preferences in matching decisions. Existing works focus on online matching with static preferences, but this is insufficient: the two-sided preference changes as soon as one side's contextual information updates, resulting in non-static matching. In this paper, we propose a dynamic matching bandit algorithm to adapt to this problem. The…
Taxonomy
Topics: Advanced Bandit Algorithms Research · Auction Theory and Applications · Data Stream Mining Techniques
