Sequential Relevance Maximization with Binary Feedback
Vijay Kamble, Nadia Fawaz, Fernando Silveira

TL;DR
This paper studies a sequential recommendation problem with binary user feedback, aiming to optimize relevance while learning user preferences, and proposes algorithms with provable performance guarantees.
Contribution
The paper characterizes the structure of the optimal policy and introduces heuristic algorithms with provable performance guarantees for relevance maximization.
Findings
Heuristic policies perform close to optimal in simulations.
The optimal policy has specific structural properties.
Proposed algorithms balance exploration and exploitation effectively.
Abstract
Motivated by online settings where users can provide explicit feedback about the relevance of products that are sequentially presented to them, we look at the recommendation process as a problem of dynamically optimizing this relevance feedback. Such an algorithm optimizes the fine tradeoff between presenting the products that are most likely to be relevant, and learning the preferences of the user so that more relevant recommendations can be made in the future. We assume a standard predictive model inspired by collaborative filtering, in which a user is sampled from a distribution over a set of possible types. For every product category, each type has an associated relevance feedback that is assumed to be binary: the category is either relevant or irrelevant. Assuming that the user stays for each additional recommendation opportunity with probability independent of the past,…
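The model described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's algorithm: it is a myopic (greedy) baseline that always shows the category most likely to be relevant under the current belief over user types, then discards the types that contradict the binary feedback. The names `prior`, `relevance`, `stay_prob`, and `greedy_session` are hypothetical, and showing each category at most once is an added assumption.

```python
import random

# Hypothetical toy instance: two user types, three product categories.
# relevance[t][c] is 1 if category c is relevant to type t, else 0.
prior = {"a": 0.5, "b": 0.5}
relevance = {"a": [1, 0, 1], "b": [1, 1, 0]}


def relevance_probability(belief, relevance, category):
    """Probability, under the current belief, that `category` is relevant."""
    return sum(p for t, p in belief.items() if relevance[t][category] == 1)


def posterior_update(belief, relevance, category, feedback):
    """Keep only the types consistent with the observed binary feedback."""
    kept = {t: p for t, p in belief.items() if relevance[t][category] == feedback}
    total = sum(kept.values())
    return {t: p / total for t, p in kept.items()}


def greedy_session(prior, relevance, true_type, stay_prob, rng):
    """Simulate one session; return the number of relevant recommendations.

    Assumes `true_type` has positive prior probability and that each
    category is shown at most once (an assumption of this sketch).
    """
    belief = dict(prior)
    remaining = set(range(len(relevance[true_type])))
    relevant_count = 0
    while remaining:
        # Myopic choice: the category most likely to be relevant right now.
        category = max(
            sorted(remaining),
            key=lambda c: relevance_probability(belief, relevance, c),
        )
        feedback = relevance[true_type][category]
        relevant_count += feedback
        belief = posterior_update(belief, relevance, category, feedback)
        remaining.remove(category)
        # The user stays for another recommendation with probability
        # stay_prob, independently of everything that happened before.
        if rng.random() >= stay_prob:
            break
    return relevant_count


if __name__ == "__main__":
    rng = random.Random(0)
    sessions = [
        greedy_session(prior, relevance, rng.choice(["a", "b"]), 0.8, rng)
        for _ in range(10_000)
    ]
    print(sum(sessions) / len(sessions))
```

A greedy rule like this ignores the value of learning the user's type early. That is the tradeoff the abstract says an optimal policy has to balance, so the sketch is useful mainly as a baseline to compare against.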
Taxonomy
Topics: Advanced Bandit Algorithms Research · Recommender Systems and Techniques · Optimization and Search Problems
