Pure Exploration and Regret Minimization in Matching Bandits
Flore Sentenac, Jialin Yi, Clément Calauzènes, Vianney Perchet, Milan Vojnovic

TL;DR
This paper studies the semi-bandit version of the optimal matching problem and shows that a rank-1 assumption on the adjacency matrix lets off-the-shelf algorithms reach a sample complexity and a regret that scale linearly in the number of vertices, up to polylogarithmic terms.
Contribution
A proof that a rank-1 assumption on the adjacency matrix reduces both the sample complexity and the regret of off-the-shelf algorithms in matching bandits, covering the settings where either a single pair or a full matching is sampled sequentially.
Findings
Reduced sample complexity under the rank-1 assumption
Reduced regret for matching bandits under the same assumption
Dependency on the number of vertices becomes linear, up to polylogarithmic terms
Abstract
Finding an optimal matching in a weighted graph is a standard combinatorial problem. We consider its semi-bandit version where either a pair or a full matching is sampled sequentially. We prove that it is possible to leverage a rank-1 assumption on the adjacency matrix to reduce the sample complexity and the regret of off-the-shelf algorithms up to reaching a linear dependency in the number of vertices (up to poly log terms).
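To make the rank-1 assumption concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's algorithm: it assumes a symmetric rank-1 structure in which the weight of the pair (i, j) is `u[i] * u[j]` for a hypothetical vector `u` of per-vertex parameters, and it assumes the weights are known rather than learned from samples. Under that assumption the graph is described by one parameter per vertex instead of one per pair, and an optimal perfect matching can be read off by sorting the vertices. All function names below are made up for this example.

```python
def matching_value(u, matching):
    """Total weight of a matching when pair (i, j) has weight u[i] * u[j]."""
    return sum(u[i] * u[j] for i, j in matching)


def sorted_matching(u):
    """Pair the best vertex with the second best, the third with the fourth, etc.

    Assumes an even number of vertices.
    """
    order = sorted(range(len(u)), key=lambda i: u[i], reverse=True)
    return [(order[k], order[k + 1]) for k in range(0, len(order), 2)]


def all_perfect_matchings(vertices):
    """Enumerate every perfect matching of a list of vertices (small inputs only)."""
    if not vertices:
        yield []
        return
    first, rest = vertices[0], vertices[1:]
    for idx, partner in enumerate(rest):
        remaining = rest[:idx] + rest[idx + 1:]
        for sub_matching in all_perfect_matchings(remaining):
            yield [(first, partner)] + sub_matching


def brute_force_best_value(u):
    """Best matching value found by exhaustive search, used as a sanity check."""
    vertices = list(range(len(u)))
    return max(matching_value(u, m) for m in all_perfect_matchings(vertices))


# Hypothetical per-vertex parameters for a graph with 6 vertices.
u = [0.9, 0.2, 0.7, 0.4, 0.5, 0.1]
greedy = sorted_matching(u)
print(greedy)
print(matching_value(u, greedy), brute_force_best_value(u))
```

On this example the sorted pairing attains the same value as the exhaustive search. In the bandit setting the parameters are unknown and must be estimated from noisy samples, which is where the sample complexity and regret guarantees come in.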
Taxonomy
Topics: Advanced Bandit Algorithms Research · Auction Theory and Applications · Data Stream Mining Techniques
