From Clicks to Conversions: Recommendation for long-term reward
Philomène Chagniot, Flavian Vasile, David Rohde

TL;DR
This paper introduces a framework for modeling long-term rewards in recommender systems, highlighting issues with last-click attribution and proposing an extension that improves long-term metric optimization.
Contribution
It presents a new framework for long-term reward modeling in recommender systems and demonstrates an extension that enhances long-term performance in simulations.
Findings
Identifies problems with last-click attribution in conversion optimization
Proposes an extension that improves long-term reward modeling
Achieves state-of-the-art results in the RecoGym simulation environment
Abstract
Recommender systems are often optimised for short-term reward: a recommendation is considered successful if a reward (e.g. a click) can be observed immediately after the recommendation. The advantage of this framework is that, with some reasonable (although questionable) assumptions, it allows familiar supervised learning tools to be used for the recommendation task. However, it means that long-term business metrics, e.g. sales or retention, are ignored. In this paper we introduce a framework for modeling long-term rewards in the RecoGym simulation environment. We use this newly introduced functionality to showcase problems introduced by the last-click attribution scheme in the case of conversion-optimized recommendations and propose a simple extension that leads to state-of-the-art results.
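To make the contrast between short-term click reward and last-click attribution concrete, here is a minimal Python sketch on a toy session. It is an illustration only: the `Event` schema and both function names are hypothetical, it does not use the RecoGym API, and it does not reproduce the paper's proposed extension, which the abstract does not describe.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Event:
    """One recommendation shown to a user (hypothetical toy schema)."""
    item: str
    clicked: bool


def click_rewards(events: List[Event]) -> List[float]:
    """Short-term reward: each recommendation is judged only by its own click."""
    return [1.0 if e.clicked else 0.0 for e in events]


def last_click_attribution(events: List[Event], converted: bool) -> List[float]:
    """Assign the whole conversion reward to the most recent clicked recommendation."""
    rewards = [0.0] * len(events)
    if not converted:
        return rewards
    # Walk backwards to find the last clicked recommendation before the conversion.
    for i in range(len(events) - 1, -1, -1):
        if events[i].clicked:
            rewards[i] = 1.0
            break
    return rewards


# Toy session: two clicked recommendations, followed by a sale.
session = [Event("shoes", True), Event("socks", False), Event("laces", True)]
rewards = last_click_attribution(session, converted=True)
print(click_rewards(session))  # [1.0, 0.0, 1.0]
print(rewards)                 # [0.0, 0.0, 1.0]
```

In this toy session the first clicked recommendation receives no credit for the sale under last-click attribution, even though it may have contributed to it. This is the kind of distortion a conversion-optimized recommender can inherit from the attribution scheme.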
Taxonomy
Topics: Recommender Systems and Techniques · Topic Modeling · Web Data Mining and Analysis
