Multiple-Play Bandits in the Position-Based Model

Paul Lagrée (UP11, LRI); Claire Vernade (LTCI); Olivier Cappé (LTCI)

arXiv:1606.02448 · cs.LG · June 9, 2016 · 27 cites

Open Access

TL;DR

This paper addresses the challenge of learning optimal item placements in multi-position displays by leveraging position bias information under the Position-based click model, providing new theoretical bounds and efficient algorithms.

Contribution

It introduces a novel regret lower bound and efficient algorithms tailored for the Position-based click model in multiple-play bandit settings.

Findings

01

Derived a new regret lower bound for PBM.

02

Proposed algorithms show strong empirical performance.

03

Established theoretical performance guarantees for the proposed placement-learning algorithms.

Abstract

Sequentially learning to place items in multi-position displays or lists is a task that can be cast into the multiple-play semi-bandit setting. However, a major concern in this context is when the system cannot decide whether the user feedback for each item is actually exploitable. Indeed, much of the content may have been simply ignored by the user. The present work proposes to exploit available information regarding the display position bias under the so-called Position-based click model (PBM). We first discuss how this model differs from the Cascade model and its variants considered in several recent works on multiple-play bandits. We then provide a novel regret lower bound for this model as well as computationally efficient algorithms that display good empirical and theoretical performance.
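
To make the click model concrete, here is a minimal Python sketch of the Position-based model as it is commonly stated: an item with attractiveness theta shown in a position with examination probability kappa is clicked with probability theta times kappa, independently across positions. The function names, the parameter values, and the assumption that the kappas are known and non-increasing are illustrative choices, not taken from the paper.

```python
import random

def expected_clicks(thetas, kappas, action):
    """Expected number of clicks when action[pos] is the item shown at position pos."""
    return sum(thetas[item] * kappas[pos] for pos, item in enumerate(action))

def simulate_clicks(thetas, kappas, action, rng):
    """Draw one round of per-position click feedback (0 or 1) under the PBM assumption."""
    return [int(rng.random() < thetas[item] * kappas[pos])
            for pos, item in enumerate(action)]

def best_action(thetas, kappas):
    """With non-increasing kappas, place the most attractive items first."""
    order = sorted(range(len(thetas)), key=lambda k: thetas[k], reverse=True)
    return order[:len(kappas)]

# Hypothetical parameters: 4 items, 3 display positions.
thetas = [0.3, 0.5, 0.1, 0.4]   # item attractiveness (assumed values)
kappas = [1.0, 0.6, 0.3]        # position examination probabilities (assumed values)

rng = random.Random(0)
action = best_action(thetas, kappas)
clicks = simulate_clicks(thetas, kappas, action, rng)
value = expected_clicks(thetas, kappas, action)
```

A missing click in a low position is ambiguous here: the user may have disliked the item or may never have looked at it. This is the exploitability concern the abstract raises.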

Taxonomy

Topics: Advanced Bandit Algorithms Research · Consumer Market Behavior and Pricing · Optimization and Search Problems