The Sampling Complexity of Condorcet Winner Identification in Dueling Bandits
El Mehdi Saad (CC-UM6P-Rabat), Victor Thuot (MISTEA), Nicolas Verzelen (MISTEA)

TL;DR
This paper introduces a new method for identifying the Condorcet winner in stochastic dueling bandits that leverages the full gap matrix, providing improved sample complexity guarantees and establishing fundamental lower bounds.
Contribution
It proposes a novel identification procedure exploiting the entire gap matrix and derives tight, instance-dependent sample complexity bounds with matching lower bounds for the problem.
Findings
High-probability, instance-dependent sample-complexity guarantees that improve on the best known ones up to logarithmic factors.
First lower bounds for Condorcet winner identification in dueling bandits.
Revealed new regimes and trade-offs in sample complexity.
Abstract
We study best-arm identification in stochastic dueling bandits under the sole assumption that a Condorcet winner exists, i.e., an arm that wins each noisy pairwise comparison with probability at least 1/2. We introduce a new identification procedure that exploits the full gap matrix (defined from the probability that arm i beats arm j, for every pair of arms i and j), rather than only the gaps between the Condorcet winner and the other arms. We derive high-probability, instance-dependent sample-complexity guarantees that (up to logarithmic factors) improve the best known ones by leveraging informative comparisons beyond those involving the winner. We complement these results with new lower bounds which, to our knowledge, are the first for Condorcet-winner identification in stochastic dueling bandits. Our lower-bound analysis isolates the intrinsic cost of locating…
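To make the two objects in the abstract concrete, here is a minimal sketch of a gap matrix and a Condorcet-winner check on a fully known preference matrix. It is not the paper's identification procedure, which must work from noisy comparisons. The names `gap_matrix` and `condorcet_winner` are hypothetical, the gap convention (win probability minus 1/2) is an assumed standard one, and the check uses strict inequalities so that the winner, when it exists, is unique.

```python
def gap_matrix(win_prob):
    """Gap of each ordered pair: win_prob[i][j] - 1/2 (assumed convention)."""
    k = len(win_prob)
    return [[win_prob[i][j] - 0.5 for j in range(k)] for i in range(k)]


def condorcet_winner(win_prob):
    """Return the index of the arm that beats every other arm, or None.

    win_prob[i][j] is the probability that arm i beats arm j.
    Strict inequality is an assumption made here for uniqueness.
    """
    gaps = gap_matrix(win_prob)
    k = len(gaps)
    for i in range(k):
        if all(gaps[i][j] > 0 for j in range(k) if j != i):
            return i
    return None


# Arm 0 beats arms 1 and 2, so it is the Condorcet winner.
with_winner = [
    [0.5, 0.6, 0.7],
    [0.4, 0.5, 0.9],
    [0.3, 0.1, 0.5],
]

# Cyclic preferences (0 beats 1, 1 beats 2, 2 beats 0): no Condorcet winner.
cyclic = [
    [0.5, 0.6, 0.4],
    [0.4, 0.5, 0.6],
    [0.6, 0.4, 0.5],
]

print(condorcet_winner(with_winner))
print(condorcet_winner(cyclic))
```

In the first matrix the gap between arms 1 and 2 is larger than either gap involving the winner, which is the kind of comparison outside the winner's row that the abstract describes as informative.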
Taxonomy
Topics: Advanced Bandit Algorithms Research · Auction Theory and Applications · Mobile Crowdsensing and Crowdsourcing
