Clustered Bandits
Loc Bui, Ramesh Johari, Shie Mannor

TL;DR
This paper introduces multi-armed bandit algorithms, motivated by e-commerce, that cluster users by their responses to the arms, and it demonstrates the value of exploiting the fact that only a few user types exist.
Contribution
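The paper proposes algorithms that combine the exploration-exploitation tradeoff with clustering of users. It also proposes and analyzes simple algorithms for the setting where the decision maker knows that a user belongs to one of a few types but does not know which one.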
Findings
Clustering improves bandit performance in user response prediction.
Algorithms outperform non-clustering baselines in simulated e-commerce scenarios.
Knowing that users belong to one of only a few types is valuable in multi-armed bandit settings.
Abstract
We consider a multi-armed bandit setting that is inspired by real-world applications in e-commerce. In our setting, there are a few types of users, each with a specific response to the different arms. When a user enters the system, his type is unknown to the decision maker. The decision maker can either treat each user separately, ignoring the previously observed users, or attempt to take advantage of knowing that only a few types exist and cluster the users according to their response to the arms. We devise algorithms that combine the usual exploration-exploitation tradeoff with clustering of users and demonstrate the value of clustering. In the process of developing algorithms for the clustered setting, we propose and analyze simple algorithms for the setup where a decision maker knows that a user belongs to one of a few types, but does not know which one.
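To make the last setup in the abstract concrete, the following is a minimal sketch of one way a decision maker could act when the set of user types is known but the current user's type is not. It is not the paper's algorithm: it swaps in a generic posterior-sampling (Thompson-style) rule over a finite set of types, and it assumes Bernoulli rewards with known per-type mean rewards strictly between 0 and 1. The names `run_known_types`, `type_means`, and `true_type` are hypothetical and chosen for illustration only.

```python
import math
import random


def run_known_types(type_means, true_type, horizon, seed=0):
    """Posterior sampling over a finite, known set of user types.

    Assumptions (not from the paper): rewards are Bernoulli, and
    type_means[t][a] is the known mean reward of arm a for type t,
    strictly between 0 and 1.
    """
    rng = random.Random(seed)
    n_types = len(type_means)
    n_arms = len(type_means[0])
    # Unnormalised log-posterior over types; all zeros is a uniform prior.
    log_post = [0.0] * n_types
    total_reward = 0

    for _ in range(horizon):
        # Sample a candidate type in proportion to its posterior weight.
        top = max(log_post)
        weights = [math.exp(lp - top) for lp in log_post]
        guess = rng.choices(range(n_types), weights=weights)[0]

        # Play the arm that would be best if the guess were correct.
        arm = max(range(n_arms), key=lambda a: type_means[guess][a])

        # The reward is generated by the user's true (hidden) type.
        reward = 1 if rng.random() < type_means[true_type][arm] else 0
        total_reward += reward

        # Bayes update: multiply each type's weight by the likelihood
        # of the observed reward under that type.
        for t in range(n_types):
            p = type_means[t][arm]
            log_post[t] += math.log(p if reward else 1.0 - p)

    # Normalise the posterior for reporting.
    top = max(log_post)
    weights = [math.exp(lp - top) for lp in log_post]
    norm = sum(weights)
    posterior = [w / norm for w in weights]
    return total_reward, posterior


if __name__ == "__main__":
    # Two hypothetical user types that prefer opposite arms.
    means = [[0.9, 0.2], [0.2, 0.9]]
    reward, posterior = run_known_types(means, true_type=1, horizon=200)
    print("total reward:", reward)
    print("posterior over types:", posterior)
```

Because the candidate types disagree on every arm in this toy instance, each observation is informative about the user's type, so the posterior concentrates quickly and the rule soon plays the arm that is best for the true type. When some arms do not distinguish between types, a rule of this kind may need to account for which arms are informative.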
Taxonomy
Topics: Advanced Bandit Algorithms Research · Auction Theory and Applications · Optimization and Search Problems
