Unreliable Multi-Armed Bandits: A Novel Approach to Recommendation Systems

Aditya Narayan Ravi; Pranav Poduval; Sharayu Moharir

arXiv:1911.06239 · stat.ML · September 5, 2024

TL;DR

This paper models a recommendation system as a multi-armed bandit that can reach its arms only through an unreliable intermediate. The intermediate is modeled as a Markov chain that the bandit is allowed to perturb, and the goal is to maximize reward despite the intermediate's autonomy.

Contribution

The paper proposes a novel bandit model with an unreliable intermediate, providing theoretical foundations and a close-to-optimal algorithm for this setting.

Findings

1. Proved fundamental theorems for the unreliable bandit model.

2. Developed a close-to-optimal Explore-Commit algorithm.

3. Demonstrated the effectiveness of the approach in recommendation scenarios.

Abstract

We use a novel modification of Multi-Armed Bandits to create a new model for recommendation systems. We model the recommendation system as a bandit seeking to maximize reward by pulling arms with unknown rewards. The catch, however, is that this bandit can only access these arms through an unreliable intermediate that has some level of autonomy in choosing its arms. For example, on a streaming website the user has a lot of autonomy in choosing the content they want to watch. Streaming sites can use targeted advertising as a means to bias the opinions of these users. Here the streaming site is the bandit aiming to maximize reward, and the user is the unreliable intermediate. We model the intermediate as accessing states via a Markov chain. The bandit is allowed to perturb this Markov chain. We prove fundamental theorems for this setting, after which we show a close-to-optimal…
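To make the setting concrete, here is a minimal Python sketch of one way such a model could be simulated. It is an illustration under stated assumptions, not the paper's algorithm: the abstract does not say how the bandit perturbs the chain, so the sketch assumes the bandit mixes each transition row with a point mass on a chosen target arm using a weight `eps`. The function names (`perturbed_row`, `step`, `explore_commit`), the Bernoulli rewards, and the fixed exploration budget are all hypothetical choices made for this example.

```python
import random


def perturbed_row(row, target, eps):
    """Mix one transition row with a point mass on `target`.

    ASSUMPTION: this mixing rule is an illustrative perturbation model,
    not the one defined in the paper.
    """
    return [(1 - eps) * p + (eps if j == target else 0.0)
            for j, p in enumerate(row)]


def step(state, P, target, eps, rng):
    """The intermediate moves to its next arm under the perturbed chain."""
    probs = perturbed_row(P[state], target, eps)
    return rng.choices(range(len(probs)), weights=probs)[0]


def explore_commit(P, means, eps, explore_rounds, horizon, seed=0):
    """Explore-then-commit over perturbation targets (hypothetical sketch).

    P       : row-stochastic transition matrix of the intermediate
    means   : Bernoulli reward mean of each arm (unknown to the bandit)
    eps     : strength of the assumed perturbation, in [0, 1]
    Returns the committed target arm and the total reward collected.
    """
    rng = random.Random(seed)
    k = len(means)
    state = 0
    reward_per_target = [0.0] * k
    total_reward = 0.0

    # Explore: nudge the intermediate toward each arm in turn and record
    # the reward observed while that nudge is active.
    for target in range(k):
        for _ in range(explore_rounds):
            state = step(state, P, target, eps, rng)
            reward = 1.0 if rng.random() < means[state] else 0.0
            reward_per_target[target] += reward
            total_reward += reward

    # Commit: keep nudging toward the empirically best target.
    best = max(range(k), key=lambda a: reward_per_target[a])
    for _ in range(horizon - k * explore_rounds):
        state = step(state, P, best, eps, rng)
        total_reward += 1.0 if rng.random() < means[state] else 0.0

    return best, total_reward


if __name__ == "__main__":
    # An intermediate with no preferences: uniform transitions over 3 arms.
    uniform = [[1 / 3] * 3 for _ in range(3)]
    best, total = explore_commit(uniform, [0.1, 0.5, 0.9], eps=0.5,
                                 explore_rounds=2000, horizon=10000)
    print("committed target:", best, "total reward:", total)
```

In this sketch the bandit never pulls an arm directly. It only chooses which arm to nudge the intermediate toward, so with `eps` below 1 the intermediate can still land on any arm its own chain allows.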
