Contextual Bandit with Herding Effects: Algorithms and Recommendation Applications

Luyue Xu; Liming Wang; Hong Xie; Mingqiang Zhou

arXiv:2408.14432·cs.LG·August 29, 2024

Open Access

TL;DR

This paper introduces TS-Conf (Thompson Sampling under Conformity), a contextual bandit algorithm for online recommendation that accounts for herding effects, which bias user feedback toward historical ratings.

Contribution

It formulates a user feedback model for herding effects and develops TS-Conf, the first algorithm tailored to address feedback bias caused by herding in recommendation systems.

Findings

01. TS-Conf outperforms four benchmark algorithms in experiments.

02. The regret bound reveals that herding effects slow down learning.

03. TS-Conf effectively reduces the impact of feedback bias.

Abstract

Contextual bandits serve as a fundamental algorithmic framework for optimizing recommendation decisions online. Though extensive attention has been paid to tailoring contextual bandits for recommendation applications, the "herding effects" in user feedback have been ignored. These herding effects bias user feedback toward historical ratings, breaking down the assumption of unbiased feedback inherent in contextual bandits. This paper develops a novel variant of the contextual bandit that is tailored to address the feedback bias caused by the herding effects. A user feedback model is formulated to capture this feedback bias. We design the TS-Conf (Thompson Sampling under Conformity) algorithm, which employs posterior sampling to balance the exploration and exploitation tradeoff. We prove an upper bound for the regret of the algorithm, revealing the impact of herding effects on learning…
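
The posterior-sampling idea mentioned in the abstract can be illustrated with a minimal sketch. This is generic linear Thompson sampling run against a simulated herding bias, not the paper's TS-Conf algorithm or its feedback model. Everything here is an assumption made for illustration: the function names `herded_feedback` and `run_linear_ts`, the convex-combination bias with a single herding `weight`, the use of one global running mean as the "historical rating", and the unit-variance Gaussian prior and likelihood.

```python
import numpy as np


def herded_feedback(true_reward, hist_mean, weight, noise_std, rng):
    """Hypothetical herding model (an assumption, not the paper's):
    the observed rating is pulled toward the historical mean rating."""
    noise = rng.normal(0.0, noise_std)
    return (1.0 - weight) * true_reward + weight * hist_mean + noise


def run_linear_ts(theta_true, n_arms=5, horizon=500, weight=0.3,
                  noise_std=0.1, seed=0):
    """Generic linear Thompson sampling that ignores the herding bias.
    Returns cumulative regret measured against the unbiased rewards."""
    rng = np.random.default_rng(seed)
    d = theta_true.shape[0]
    precision = np.eye(d)   # posterior precision, prior N(0, I)
    b = np.zeros(d)         # running sum of feedback-weighted contexts
    hist_sum, hist_count = 0.0, 0
    regret = 0.0
    for _ in range(horizon):
        contexts = rng.normal(size=(n_arms, d))
        # Posterior mean, assuming unit noise variance for simplicity.
        mean = np.linalg.solve(precision, b)
        # Sample theta ~ N(mean, precision^{-1}) via a Cholesky factor.
        chol = np.linalg.cholesky(precision)
        theta_sample = mean + np.linalg.solve(chol.T, rng.normal(size=d))
        # Act greedily with respect to the sampled parameter.
        arm = int(np.argmax(contexts @ theta_sample))
        x = contexts[arm]
        true_rewards = contexts @ theta_true
        hist_mean = hist_sum / hist_count if hist_count else 0.0
        y = herded_feedback(true_rewards[arm], hist_mean, weight,
                            noise_std, rng)
        # Bayesian linear-regression update with the (biased) feedback.
        precision += np.outer(x, x)
        b += y * x
        hist_sum += y
        hist_count += 1
        regret += true_rewards.max() - true_rewards[arm]
    return regret


if __name__ == "__main__":
    theta = np.array([1.0, -0.5, 0.2])
    for w in (0.0, 0.3, 0.6):
        print(f"herding weight {w}: regret {run_linear_ts(theta, weight=w):.2f}")
```

Running the script compares cumulative regret for a few herding weights under this toy model. Because the learner treats the biased ratings as unbiased, the sketch only shows where the standard unbiased-feedback assumption enters the update; it does not reproduce any result from the paper.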

Taxonomy

Topics: Advanced Bandit Algorithms Research · Data Stream Mining Techniques

Methods: Softmax · Attention Is All You Need