Decoy Bandits Dueling on a Poset

Julien Audiffren (CMLA); Ralaivola Liva (LIF)

arXiv:1602.02706·cs.LG·June 10, 2016

TL;DR

This paper introduces two algorithms, UnchainedBandits and SlicingBandits, for efficiently identifying optimal arms in dueling bandits defined on partially ordered sets, with theoretical guarantees and experimental validation.

Contribution

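The paper presents two algorithms for dueling bandits on posets. UnchainedBandits uses decoys to find the optimal arms under minimal assumptions, even when incomparable pairs cannot be distinguished from comparable ones. SlicingBandits exploits incomparability information, when it is accessible, to improve performance.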

Findings

01

UnchainedBandits finds the set of optimal arms of any poset, even when comparable and incomparable pairs of arms cannot be told apart.

02

SlicingBandits significantly outperforms UnchainedBandits when incomparability information is accessible.

03

Both algorithms come with theoretical guarantees and an experimental evaluation.

Abstract

We address the problem of dueling bandits defined on partially ordered sets, or posets. In this setting, arms may not be comparable, and there may be several (incomparable) optimal arms. We propose an algorithm, UnchainedBandits, that efficiently finds the set of optimal arms of any poset even when pairs of comparable arms cannot be distinguished from pairs of incomparable arms, with a set of minimal assumptions. This algorithm relies on the concept of decoys, which stems from social psychology. For the easier case where the incomparability information may be accessible, we propose a second algorithm, SlicingBandits, which takes advantage of this information and achieves a very significant gain of performance compared to UnchainedBandits. We provide theoretical guarantees and experimental evaluation for both algorithms.
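To make the target of the abstract concrete, the sketch below computes the set of optimal arms of a small poset when the order is fully known: the optimal arms are the maximal elements, that is, the arms that no other arm dominates. This is only an illustration of the objective, not the paper's UnchainedBandits or SlicingBandits procedure, which must estimate the order from noisy duels. The function name `maximal_arms`, the arm labels, and the `dominates` relation are all hypothetical.

```python
from typing import Hashable, Iterable, Set, Tuple


def maximal_arms(
    arms: Iterable[Hashable],
    dominates: Set[Tuple[Hashable, Hashable]],
) -> Set[Hashable]:
    """Return the arms that no other arm dominates.

    `dominates` holds pairs (b, a) meaning "b is strictly better than a".
    Pairs absent in both directions are treated as incomparable.
    """
    arms = list(arms)
    return {
        a
        for a in arms
        if not any((b, a) in dominates for b in arms if b != a)
    }


# Hypothetical poset: a > c, b > c, b > d; a and b are incomparable.
dominates = {("a", "c"), ("b", "c"), ("b", "d")}
optimal = maximal_arms(["a", "b", "c", "d"], dominates)
print(sorted(optimal))  # ['a', 'b']
```

Here both `a` and `b` are optimal because neither dominates the other, which is the situation the abstract describes as several (incomparable) optimal arms. In the bandit setting the relation is not given and has to be inferred from the outcomes of duels.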

Taxonomy

Topics: Advanced Bandit Algorithms Research · Mobile Crowdsensing and Crowdsourcing · Machine Learning and Algorithms