Offline Evaluation for Reinforcement Learning-based Recommendation: A Critical Issue and Some Alternatives

Romain Deffayet; Thibaut Thonet; Jean-Michel Renders; Maarten de Rijke

arXiv:2301.00993 · cs.IR · January 4, 2023 · 5 cites

TL;DR

This paper critiques current offline evaluation methods for reinforcement learning-based recommender systems, highlighting their shortcomings and proposing alternative evaluation approaches to better reflect RL benefits.

Contribution

It identifies the limitations of next-item prediction protocols in offline RL recommendation evaluation and suggests new methods to improve assessment reliability.

Findings

01 Current evaluation protocols hide RL deficiencies

02 Next-item prediction does not reflect RL benefits

03 Proposed alternatives aim for more reliable evaluation

Abstract

In this paper, we argue that the paradigm commonly adopted for offline evaluation of sequential recommender systems is unsuitable for evaluating reinforcement learning-based recommenders. We find that most of the existing offline evaluation practices for reinforcement learning-based recommendation are based on a next-item prediction protocol, and detail three shortcomings of such an evaluation protocol. Notably, it cannot reflect the potential benefits that reinforcement learning (RL) is expected to bring while it hides critical deficiencies of certain offline RL agents. Our suggestions for alternative ways to evaluate RL-based recommender systems aim to shed light on the existing possibilities and inspire future research on reliable evaluation protocols.
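
For readers unfamiliar with the next-item prediction protocol the abstract refers to, the following is a minimal sketch of what such an evaluation typically looks like. It is not the authors' code: the function names (`next_item_hit_rate`, `repeat_last_items`), the hit-rate-at-k metric, and the toy sessions are all assumptions made for illustration. The sketch holds out the last item of each logged session and checks whether a recommender ranks it in its top k.

```python
from typing import Callable, Sequence


def next_item_hit_rate(
    sessions: Sequence[Sequence[int]],
    recommend: Callable[[Sequence[int]], Sequence[int]],
    k: int = 10,
) -> float:
    """Fraction of sessions whose held-out last item appears in the top-k list."""
    hits = 0
    evaluated = 0
    for session in sessions:
        if len(session) < 2:
            continue  # need at least one history item and one target
        history, target = session[:-1], session[-1]
        top_k = list(recommend(history))[:k]
        hits += int(target in top_k)
        evaluated += 1
    return hits / evaluated if evaluated else 0.0


def repeat_last_items(history: Sequence[int]) -> list:
    """Hypothetical toy recommender: most recent items first, deduplicated."""
    ranked = []
    for item in reversed(history):
        if item not in ranked:
            ranked.append(item)
    return ranked


# Toy logged sessions (made-up item IDs, for illustration only).
sessions = [[1, 2, 1], [3, 4, 5], [7, 8, 7]]
score = next_item_hit_rate(sessions, repeat_last_items, k=2)
print(f"hit rate@2 = {score:.3f}")
```

Note that the score only measures agreement with the single logged next interaction. Under this sketch, a policy is never rewarded for anything that happens later in the session, which is one way to see why such a protocol may fail to reflect the longer-term benefits RL is expected to bring.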

Taxonomy

Topics: Advanced Bandit Algorithms Research · Reinforcement Learning in Robotics · Smart Grid Energy Management