Identifying Copeland Winners in Dueling Bandits with Indifferences

Viktor Bengs; Björn Haddenhorst; Eyke Hüllermeier

arXiv:2310.00750·cs.LG·October 3, 2023

Open Access

TL;DR

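This paper studies the identification of the Copeland winner(s) in dueling bandits with ternary feedback, where a duel may end in a strict preference or an indifference. It derives a lower bound on the sample complexity, proposes POCOWISTA, an algorithm whose sample complexity almost matches that bound and which performs well empirically, and gives a refined version with an improved worst-case sample complexity under a specific type of stochastic transitivity.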

Contribution

It introduces POCOWISTA, a novel algorithm for Copeland winner identification in dueling bandits with indifferences, along with theoretical bounds and empirical validation.

Findings

01

Proposed POCOWISTA algorithm achieves near-optimal sample complexity.

02

Established lower bounds for sample complexity in this setting.

03

Refined version of POCOWISTA with an improved worst-case sample complexity when the preference probabilities satisfy a specific type of stochastic transitivity.

Abstract

We consider the task of identifying the Copeland winner(s) in a dueling bandits problem with ternary feedback. This is an underexplored but practically relevant variant of the conventional dueling bandits problem, in which, in addition to strict preference between two arms, one may observe feedback in the form of an indifference. We provide a lower bound on the sample complexity for any learning algorithm finding the Copeland winner(s) with a fixed error probability. Moreover, we propose POCOWISTA, an algorithm with a sample complexity that almost matches this lower bound, and which shows excellent empirical performance, even for the conventional dueling bandits problem. For the case where the preference probabilities satisfy a specific type of stochastic transitivity, we provide a refined version with an improved worst case sample complexity.
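
To make the target of the abstract concrete, the following is a minimal sketch of how Copeland winner(s) could be read off from known ternary preference probabilities. It is not the POCOWISTA algorithm, which has to estimate these quantities from samples. The scoring convention is an assumption made for illustration: an arm earns 1 point against an opponent when its strict win is the most probable outcome, and both arms earn 1/2 when indifference is the most probable outcome. The names `copeland_scores`, `copeland_winners`, and the `probs` layout are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: probs[(i, j)] = (P(i preferred), P(indifferent), P(j preferred)), i < j.
Ternary = Tuple[float, float, float]


def copeland_scores(n_arms: int, probs: Dict[Tuple[int, int], Ternary]) -> List[float]:
    """Score each arm by the most probable outcome of each of its duels.

    Assumed convention (not taken from the source): 1 point for a duel whose
    most probable outcome is a strict win, 1/2 point to both arms when
    indifference is the most probable outcome, no points if the mode is not unique.
    """
    scores = [0.0] * n_arms
    for (i, j), (win, tie, loss) in probs.items():
        if win > max(tie, loss):
            scores[i] += 1.0
        elif loss > max(win, tie):
            scores[j] += 1.0
        elif tie > max(win, loss):
            scores[i] += 0.5
            scores[j] += 0.5
    return scores


def copeland_winners(n_arms: int, probs: Dict[Tuple[int, int], Ternary]) -> List[int]:
    """Return every arm attaining the maximal score (there may be several)."""
    scores = copeland_scores(n_arms, probs)
    best = max(scores)
    return [arm for arm, score in enumerate(scores) if score == best]


# Toy instance with three arms; the probabilities are made up for illustration.
example = {
    (0, 1): (0.6, 0.1, 0.3),  # arm 0 most likely beats arm 1
    (0, 2): (0.2, 0.5, 0.3),  # arms 0 and 2 are most likely indifferent
    (1, 2): (0.1, 0.2, 0.7),  # arm 2 most likely beats arm 1
}
print(copeland_winners(3, example))  # prints [0, 2]
```

The toy instance has two winners, which is why the abstract speaks of Copeland winner(s): unlike some other notions of a best arm, the set of maximal-score arms is always non-empty but need not be a single arm.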

Taxonomy

Topics: Auction Theory and Applications · Advanced Bandit Algorithms Research · Data Stream Mining Techniques