Multi-Player Approaches for Dueling Bandits
Or Raveh, Junya Honda, Masashi Sugiyama

TL;DR
This paper introduces new multiplayer algorithms for dueling bandits that improve exploration efficiency and outperform single-player benchmarks, addressing the unique challenges of distributed preference-based decision-making.
Contribution
The paper shows that a Follow Your Leader black-box approach, built on known dueling bandit algorithms, matches the lower bound for the multiplayer setting. It also analyzes a fully distributed message-passing approach with a novel Condorcet-winner recommendation protocol.
Findings
Black-box approach matches theoretical lower bounds.
Distributed protocol accelerates exploration in many cases.
Multiplayer algorithms outperform single-player benchmarks.
Abstract
Various approaches have emerged for multi-armed bandits in distributed systems. The multiplayer dueling bandit problem, common in scenarios with only preference-based information like human feedback, introduces challenges related to controlling collaborative exploration of non-informative arm pairs, but has received little attention. To fill this gap, we demonstrate that the direct use of a Follow Your Leader black-box approach matches the lower bound for this setting when utilizing known dueling bandit algorithms as a foundation. Additionally, we analyze a message-passing fully distributed approach with a novel Condorcet-winner recommendation protocol, resulting in expedited exploration in many cases. Our experimental comparisons reveal that our multiplayer algorithms surpass single-player benchmark algorithms, underscoring their efficacy in addressing the nuanced challenges of the…
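For readers unfamiliar with the term, a Condorcet winner is an arm that beats every other arm in a pairwise duel with probability greater than 1/2. The sketch below is only an illustration of that definition, not the paper's recommendation protocol. The function name `condorcet_winner` and the preference matrices are hypothetical and chosen for this example.

```python
from typing import Optional, Sequence


def condorcet_winner(pref: Sequence[Sequence[float]]) -> Optional[int]:
    """Return the index of the Condorcet winner, or None if there is none.

    pref[i][j] is assumed to be the probability that arm i wins a duel
    against arm j (so pref[i][j] + pref[j][i] == 1 and pref[i][i] == 0.5).
    """
    k = len(pref)
    for i in range(k):
        # Arm i is a Condorcet winner if it beats every other arm
        # with probability strictly greater than 1/2.
        if all(pref[i][j] > 0.5 for j in range(k) if j != i):
            return i
    return None


# Hypothetical 3-arm instance in which arm 0 beats both other arms.
PREF_WITH_WINNER = [
    [0.5, 0.6, 0.7],
    [0.4, 0.5, 0.8],
    [0.3, 0.2, 0.5],
]

# Hypothetical cyclic instance (0 beats 1, 1 beats 2, 2 beats 0): no winner.
PREF_CYCLIC = [
    [0.5, 0.6, 0.4],
    [0.4, 0.5, 0.6],
    [0.6, 0.4, 0.5],
]

print(condorcet_winner(PREF_WITH_WINNER))
print(condorcet_winner(PREF_CYCLIC))
```

In a bandit setting the preference matrix is unknown and must be estimated from duel outcomes. This sketch assumes it is given, purely to make the definition concrete.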
Taxonomy
Topics: Advanced Bandit Algorithms Research · Data Stream Mining Techniques · Smart Grid Energy Management
