Diversity-Preserving K-Armed Bandits, Revisited

Hédi Hadiji (L2S), Sébastien Gerchinovitz (IMT), Jean-Michel Loubes (IMT), Gilles Stoltz (CELESTE, LMO)

arXiv:2010.01874 · stat.ML · July 25, 2024

TL;DR

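This paper revisits the diversity-preserving bandit framework of Celis et al. (2019). It introduces a UCB algorithm whose distribution-dependent regret is bounded when the optimal mixed actions put probability mass on all actions, and it establishes $\ln T$ regret lower bounds otherwise, at least for mean-unbounded models.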

Contribution

It designs a UCB algorithm that exploits the specific structure of the diversity-preserving setting, rather than relying mainly on the reduction to linear bandits used by Celis et al. (2019), and analyzes its regret.

Findings

01

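The UCB algorithm enjoys a bounded distribution-dependent regret when the optimal mixed actions put some probability mass on all actions, i.e., when diversity is desirable.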

02

When the optimal mixed actions do not put mass on all actions, regret lower bounds show that a $\ln T$ regret is suffered, at least in mean-unbounded models.

03

The paper also discusses an example beyond the special case of polytopes.

Abstract

We consider the bandit-based framework for diversity-preserving recommendations introduced by Celis et al. (2019), who approached it in the case of a polytope mainly by a reduction to the setting of linear bandits. We design a UCB algorithm using the specific structure of the setting and show that it enjoys a bounded distribution-dependent regret in the natural cases when the optimal mixed actions put some probability mass on all actions (i.e., when diversity is desirable). The regret lower bounds provided show that otherwise, at least when the model is mean-unbounded, a $\ln T$ regret is suffered. We also discuss an example beyond the special case of polytopes.
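
To make the setting concrete, here is a minimal Python sketch of a UCB-style strategy over mixed actions. It is an illustration under stated assumptions, not the authors' algorithm: the polytope is assumed to be the simple set of distributions with a per-arm probability floor, the arms are assumed Bernoulli, and the index is the classical UCB1 bonus. The names `max_linear_over_floor_polytope`, `diversity_ucb`, `floors`, and `means` are hypothetical. With strictly positive floors, the optimal mixed action in this toy polytope puts mass on every arm, which is the case where the abstract announces bounded regret.

```python
import math
import random


def max_linear_over_floor_polytope(scores, floors):
    """Maximize sum_a p[a] * scores[a] over {p : p[a] >= floors[a], sum(p) == 1}.

    For this (assumed) polytope the optimum keeps every arm at its floor
    and gives the leftover mass to the arm with the largest score.
    """
    slack = 1.0 - sum(floors)
    best = max(range(len(scores)), key=lambda a: scores[a])
    p = list(floors)
    p[best] += slack
    return p


def diversity_ucb(means, floors, horizon, seed=0):
    """Run the sketch on Bernoulli arms; return (pseudo-regret, pull counts)."""
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k
    sums = [0.0] * k
    # Best mixed action in the polytope, used only to measure regret.
    p_star = max_linear_over_floor_polytope(means, floors)
    best_value = sum(p * m for p, m in zip(p_star, means))
    regret = 0.0
    for t in range(1, horizon + 1):
        # Optimistic index per arm (UCB1 bonus, an assumption of this sketch);
        # arms never pulled so far get an infinite index.
        ucb = [
            sums[a] / counts[a] + math.sqrt(2.0 * math.log(t) / counts[a])
            if counts[a] > 0 else math.inf
            for a in range(k)
        ]
        # Play the mixed action that is optimal for the optimistic indices.
        p = max_linear_over_floor_polytope(ucb, floors)
        arm = rng.choices(range(k), weights=p)[0]
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        # Regret compares mixed actions, not individual arm pulls.
        regret += best_value - sum(q * m for q, m in zip(p, means))
    return regret, counts
```

For example, `diversity_ucb([0.3, 0.7], [0.1, 0.1], 500)` simulates 500 rounds in which each arm must receive probability at least 0.1. The regret is measured against the best mixed action in the constraint set rather than the best single arm, since the latter is not a feasible choice under the diversity constraint.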

Taxonomy

Topics: Advanced Bandit Algorithms Research · Reinforcement Learning in Robotics · Auction Theory and Applications