Learning by Repetition: Stochastic Multi-armed Bandits under Priming Effect

Priyank Agrawal; Theja Tulabandhula

arXiv:2006.10356·cs.LG·June 19, 2020·1 cites

TL;DR

This paper investigates how persistence effects like wear-in and wear-out influence learning in stochastic multi-armed bandits, proposing algorithms that adapt to these effects and achieve sublinear regret.

Contribution

It introduces novel algorithms that account for priming effects in bandits, achieving regret bounds that incorporate wear-in and wear-out parameters.

Findings

01. The proposed algorithms achieve regret that is sublinear in time and in the wear-in/wear-out parameters.

02. The effect of priming on the regret upper bound is additive, and the guarantee matches that of classical algorithms when there is no priming.

03. The approach extends the modeling of time-varying rewards in bandit problems.

Abstract

We study the effect of persistence of engagement on learning in a stochastic multi-armed bandit setting. In advertising and recommendation systems, the repetition effect includes a wear-in period, where the user's propensity to reward the platform via a click or purchase depends on how frequently they have seen the recommendation in the recent past. It also includes a counteracting wear-out period, where the user's propensity to respond positively is dampened if the recommendation was shown too many times recently. The priming effect can be naturally modelled as a temporal constraint on the strategy space, since the reward for the current action depends on historical actions taken by the platform. We provide novel algorithms that achieve sublinear regret in time and the relevant wear-in/wear-out parameters. The effect of priming on the regret upper bound is also additive, and we get back a guarantee…
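
The wear-in and wear-out description above can be made concrete with a toy simulator. The sketch below is an illustration only, not the paper's model or algorithm: the class name `PrimedBandit`, the parameters `wear_in`, `wear_out`, and `window`, and the rule that an arm pays out only when its recent pull count lies between `wear_in` and `wear_out` are all assumptions chosen to show how the current reward can depend on past actions.

```python
import random
from collections import deque


class PrimedBandit:
    """Toy Bernoulli bandit whose rewards depend on recent pulls.

    Assumption (hypothetical, not taken from the paper): an arm is
    "primed" when it was pulled at least `wear_in` and at most
    `wear_out` times in the previous `window` rounds. An unprimed
    arm always returns 0.
    """

    def __init__(self, means, wear_in, wear_out, window, seed=0):
        self.means = means
        self.wear_in = wear_in
        self.wear_out = wear_out
        # deque with maxlen keeps only the last `window` pulls
        self.history = deque(maxlen=window)
        self.rng = random.Random(seed)

    def recent_count(self, arm):
        """Number of times `arm` appears in the recent-pull window."""
        return sum(1 for a in self.history if a == arm)

    def pull(self, arm):
        """Pull `arm`, return a 0/1 reward, and record the pull."""
        count = self.recent_count(arm)
        primed = self.wear_in <= count <= self.wear_out
        reward = 0
        if primed and self.rng.random() < self.means[arm]:
            reward = 1
        self.history.append(arm)
        return reward


env = PrimedBandit(means=[1.0, 0.0], wear_in=2, wear_out=3, window=4)
rewards = [env.pull(0) for _ in range(5)]
```

In this run arm 0 has mean 1.0, so the rewards trace the priming rule directly: the first two pulls return 0 (wear-in), the next two return 1, and the fifth returns 0 because the arm now appears four times in the window (wear-out). A learner in such an environment cannot evaluate an arm from a single pull, which is the sense in which priming constrains the strategy space.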

Taxonomy

Topics: Advanced Bandit Algorithms Research · Smart Grid Energy Management · Auction Theory and Applications