RecoGym: A Reinforcement Learning Environment for the problem of Product Recommendation in Online Advertising

David Rohde; Stephen Bonner; Travis Dunlop; Flavian Vasile; Alexandros Karatzoglou

arXiv:1808.00720·cs.IR·September 17, 2018·60 cites

Open Access 1 Repo

TL;DR

RecoGym is a reinforcement learning environment designed for product recommendation in online advertising, aiming to improve the alignment between offline metrics and online performance by modeling user interactions.

Contribution

The paper introduces RecoGym, a novel RL environment for recommendation systems based on user traffic and response models, fostering collaboration between RL and recommender systems.

Findings

01. Provides a simulation environment for RL-based recommendation research.

02. Facilitates testing of long-term optimization strategies.

03. Aims to bridge the gap between offline metrics and online performance.

Abstract

Recommender Systems are becoming ubiquitous in many settings and take many forms, from product recommendation in e-commerce stores, to query suggestions in search engines, to friend recommendation in social networks. Current research directions, which are largely based upon supervised learning from historical data, appear to be showing diminishing returns, with many practitioners reporting a discrepancy between improvements in offline metrics for supervised learning and the online performance of the newly proposed models. One possible reason is that we are using the wrong paradigm: when looking at the long-term cycle of collecting historical performance data, creating a new version of the recommendation model, A/B testing it and then rolling it out, we see that there are a lot of commonalities with the reinforcement learning (RL) setup, where the agent observes the environment and acts upon…
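The observe, act, and learn cycle that the abstract compares to reinforcement learning can be illustrated with a minimal sketch. This is not RecoGym's API: `ToyRecoEnv`, `EpsilonGreedyAgent`, `run`, and the click probabilities are all hypothetical names and values chosen for illustration, and epsilon-greedy is just one simple agent that could be plugged into such a loop.

```python
import random


class ToyRecoEnv:
    """Hypothetical stand-in for a recommendation simulator (not RecoGym's API)."""

    def __init__(self, click_probs, seed=0):
        # click_probs[i] is an assumed click probability for product i.
        self.click_probs = click_probs
        self.rng = random.Random(seed)

    def step(self, action):
        # Reward is 1 if the simulated user clicks the recommended product, else 0.
        return 1 if self.rng.random() < self.click_probs[action] else 0


class EpsilonGreedyAgent:
    """Recommends the product with the best observed click rate, exploring at rate epsilon."""

    def __init__(self, n_products, epsilon=0.1, seed=0):
        self.epsilon = epsilon
        self.counts = [0] * n_products   # times each product was recommended
        self.clicks = [0] * n_products   # clicks observed per product
        self.rng = random.Random(seed)

    def act(self):
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))  # explore
        rates = [c / n if n else 0.0 for c, n in zip(self.clicks, self.counts)]
        return max(range(len(rates)), key=rates.__getitem__)  # exploit

    def update(self, action, reward):
        self.counts[action] += 1
        self.clicks[action] += reward


def run(n_steps=5000):
    """Run the act -> reward -> update loop and return total clicks and per-product counts."""
    env = ToyRecoEnv([0.02, 0.05, 0.10])
    agent = EpsilonGreedyAgent(n_products=3)
    total_clicks = 0
    for _ in range(n_steps):
        action = agent.act()
        reward = env.step(action)
        agent.update(action, reward)
        total_clicks += reward
    return total_clicks, agent.counts
```

Calling `run()` returns the number of simulated clicks and how often each product was recommended. The point of the sketch is that the agent is scored by the clicks it collects while interacting, rather than by an offline metric computed on historical logs.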

Code & Models

Repositories

criteo-research/reco-gym · pytorch · Official

Taxonomy

Topics: Advanced Bandit Algorithms Research · Recommender Systems and Techniques · Smart Grid Energy Management