Bernoulli Rank-$1$ Bandits for Click Feedback

Sumeet Katariya; Branislav Kveton; Csaba Szepesvári; Claire Vernade; Zheng Wen

arXiv:1703.06513 · cs.LG · March 21, 2017 · 6 cites

TL;DR

This paper introduces Rank1ElimKL, an improved algorithm for Bernoulli rank-1 bandits whose regret stays competitive however small the minimum reward is, and which outperforms the earlier Rank1Elim, especially when rewards are small.

Contribution

The paper proposes Rank1ElimKL, which replaces the confidence intervals of Rank1Elim with KL-divergence-based ones so that the algorithm stays competitive even when the minimum rewards are very small.
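To illustrate why a KL-based interval helps when rewards are small, here is a minimal sketch of a generic KL upper confidence bound for a Bernoulli mean, next to a Hoeffding-style bound. It is not the paper's exact confidence interval: the function names, the `budget` parameter (standing in for a log-confidence term), and the bisection depth are all assumptions made for illustration.

```python
import math


def bernoulli_kl(p, q, eps=1e-12):
    """KL divergence between Bernoulli(p) and Bernoulli(q), clipped away from 0 and 1."""
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))


def kl_upper_bound(p_hat, n, budget, iters=50):
    """Largest q >= p_hat with n * KL(p_hat, q) <= budget, found by bisection.

    `budget` is a hypothetical stand-in for the log-confidence term.
    """
    lo, hi = p_hat, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if n * bernoulli_kl(p_hat, mid) <= budget:
            lo = mid  # mid is still inside the confidence region
        else:
            hi = mid  # mid is outside, so shrink from above
    return lo


def hoeffding_upper_bound(p_hat, n, budget):
    """Hoeffding-style bound with the same budget, for comparison."""
    return min(1.0, p_hat + math.sqrt(budget / (2 * n)))


# Assumed example values: a small empirical mean after 100 observations.
p_hat, n, budget = 0.05, 100, 3.0
print("KL upper bound:       ", kl_upper_bound(p_hat, n, budget))
print("Hoeffding upper bound:", hoeffding_upper_bound(p_hat, n, budget))
```

By Pinsker's inequality the KL bound is never looser than the Hoeffding one, and the difference is largest when the empirical mean is close to 0 or 1, which is the small-reward regime discussed here.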

Findings

01. On benign instances, Rank1ElimKL outperforms Rank1Elim.

02. Experiments show significant improvements on both synthetic and real data.

03. Rank1ElimKL remains competitive regardless of how small the minimum reward is.

Abstract

The probability that a user will click a search result depends both on its relevance and its position on the results page. The position-based model explains this behavior by ascribing to every item an attraction probability, and to every position an examination probability. To be clicked, a result must be both attractive and examined. The probabilities of an item-position pair being clicked thus form the entries of a rank-$1$ matrix. We propose the learning problem of a Bernoulli rank-$1$ bandit where at each step, the learning agent chooses a pair of row and column arms, and receives the product of their Bernoulli-distributed values as a reward. This is a special case of the stochastic rank-$1$ bandit problem considered in recent work that proposed an elimination-based algorithm Rank1Elim, and showed that Rank1Elim's regret scales linearly with the number of rows and columns on…
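The reward model described in the abstract can be sketched as a small simulator. This is only an illustration under stated assumptions: the vectors `u` and `v` (row and column means, for example attraction and examination probabilities), their values, and the helper names are hypothetical and not taken from the paper.

```python
import random


def click_matrix(u, v):
    """Expected reward of every (row, column) pair: the rank-1 outer product of u and v."""
    return [[ui * vj for vj in v] for ui in u]


def pull(u, v, i, j, rng):
    """Pull row arm i and column arm j.

    The reward is the product of two independent Bernoulli draws,
    so it is 1 only when both draws are 1.
    """
    row_draw = 1 if rng.random() < u[i] else 0
    col_draw = 1 if rng.random() < v[j] else 0
    return row_draw * col_draw


# Hypothetical instance: 3 row arms and 2 column arms.
u = [0.8, 0.3, 0.1]
v = [0.5, 0.2]
rng = random.Random(0)

n_pulls = 20000
avg = sum(pull(u, v, 0, 0, rng) for _ in range(n_pulls)) / n_pulls
print("empirical mean of pair (0, 0):", avg)
print("expected mean of pair (0, 0): ", click_matrix(u, v)[0][0])
```

The learner only observes the product, never the two individual draws, which is what distinguishes this setting from two independent bandit problems over rows and columns.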

Taxonomy

Topics: Advanced Bandit Algorithms Research · Smart Grid Energy Management · Auction Theory and Applications