When Can We Track Significant Preference Shifts in Dueling Bandits?

Joe Suk; Arpit Agarwal

arXiv:2302.06595·cs.LG·January 26, 2024

TL;DR

This paper asks when a dueling bandit algorithm can adapt to significant shifts in user preferences over time, and shows that such adaptive algorithms are feasible only for certain classes of preference distributions.

Contribution

It provides the first analysis of dynamic regret bounds for dueling bandits with distribution shifts, identifying classes where such bounds are achievable or impossible.

Findings

01

Impossibility of $O(\sqrt{K\tilde{L}T})$ regret under the Condorcet and SST classes.

02

Feasibility of such regret bounds within the SST ∩ STI class.

03

Almost complete characterization of preference classes for adaptive dueling bandits.
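The class names in the findings (Condorcet, SST, STI) refer to standard conditions on a pairwise preference matrix. The following is a minimal sketch of those conditions, not code from the paper or its repository. It assumes a hypothetical matrix `P` where `P[i, j]` is the probability that arm `i` beats arm `j`, and a candidate ranking `order` (best arm first) supplied by the caller. All function names and example matrices are illustrative.

```python
from itertools import combinations

import numpy as np

TOL = 1e-9  # tolerance for floating-point comparisons


def condorcet_winner(P):
    """Return the arm that beats every other arm with probability > 1/2, or None."""
    K = P.shape[0]
    for i in range(K):
        if all(P[i, j] > 0.5 for j in range(K) if j != i):
            return i
    return None


def satisfies_sst(P, order):
    """Strong stochastic transitivity w.r.t. `order` (best arm first):
    for i > j > k in the ranking, P[i, k] >= max(P[i, j], P[j, k])."""
    for i, j, k in combinations(order, 3):
        if min(P[i, j], P[j, k], P[i, k]) < 0.5 - TOL:
            return False  # ranking is inconsistent with pairwise preferences
        if P[i, k] < max(P[i, j], P[j, k]) - TOL:
            return False
    return True


def satisfies_sti(P, order):
    """Stochastic triangle inequality w.r.t. `order`:
    gap(i, k) <= gap(i, j) + gap(j, k), where gap(a, b) = P[a, b] - 1/2."""
    for i, j, k in combinations(order, 3):
        if (P[i, k] - 0.5) > (P[i, j] - 0.5) + (P[j, k] - 0.5) + TOL:
            return False
    return True


def duel(P, i, j, rng):
    """Sample one noisy comparison: True if arm i beats arm j."""
    return bool(rng.random() < P[i, j])


# Illustrative 3-arm matrices (made up for this sketch).
P_sst_sti = np.array([[0.5, 0.6, 0.7],
                      [0.4, 0.5, 0.6],
                      [0.3, 0.4, 0.5]])
P_sst_not_sti = np.array([[0.5, 0.6, 0.9],
                          [0.4, 0.5, 0.6],
                          [0.1, 0.4, 0.5]])
P_condorcet_only = np.array([[0.5, 0.6, 0.55],
                             [0.4, 0.5, 0.6],
                             [0.45, 0.4, 0.5]])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    order = [0, 1, 2]
    for name, P in [("SST and STI", P_sst_sti),
                    ("SST, not STI", P_sst_not_sti),
                    ("Condorcet only", P_condorcet_only)]:
        print(name, condorcet_winner(P),
              satisfies_sst(P, order), satisfies_sti(P, order))
    print("one duel, arm 0 vs arm 2:", duel(P_sst_sti, 0, 2, rng))
```

All three example matrices have arm 0 as Condorcet winner, but only the first lies in SST ∩ STI, which illustrates that the classes are nested rather than equivalent.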

Abstract

The $K$-armed dueling bandits problem, where the feedback is in the form of noisy pairwise preferences, has been widely studied due to its applications in information retrieval, recommendation systems, etc. Motivated by concerns that user preferences/tastes can evolve over time, we consider the problem of dueling bandits with distribution shifts. Specifically, we study the recent notion of significant shifts (Suk and Kpotufe, 2022), and ask whether one can design an adaptive algorithm for the dueling problem with $O(\sqrt{K\tilde{L}T})$ dynamic regret, where $\tilde{L}$ is the (unknown) number of significant shifts in preferences. We show that the answer to this question depends on the properties of underlying preference distributions. Firstly, we give an impossibility result that rules out any algorithm with $O(\sqrt{K\tilde{L}T})$ dynamic regret under the well-studied Condorcet and SST…
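As a reading aid, dynamic regret in dueling bandits is commonly measured against the winner arm of each round. The display below is a sketch under assumed notation, none of which appears on this page: $P_t(i,j)$ is the round-$t$ probability that arm $i$ beats arm $j$, $\delta_t(i,j) = P_t(i,j) - 1/2$ is the corresponding gap, $a_t^*$ is the round-$t$ Condorcet winner, and $(i_t, j_t)$ is the pair of arms played at round $t$.

$$\mathrm{DR}(T) = \sum_{t=1}^{T} \frac{\delta_t(a_t^*, i_t) + \delta_t(a_t^*, j_t)}{2}$$

Under this convention, playing the current winner against itself incurs zero regret in that round, and each round contributes at most $1/2$.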

Code & Models

Repositories

joesuk/nonstationary-duel · Official

Videos

When Can We Track Significant Preference Shifts in Dueling Bandits?· slideslive

Taxonomy

Topics: Advanced Bandit Algorithms Research · Auction Theory and Applications · Machine Learning and Algorithms