Towards Validating Long-Term User Feedbacks in Interactive Recommendation Systems
Hojoon Lee, Dongyoon Hwang, Kyushik Min, Jaegul Choo

TL;DR
This paper critically examines the effectiveness of using review datasets for evaluating long-term user feedback in Interactive Recommender Systems, revealing that simple models often outperform complex RL-based approaches in such evaluations.
Contribution
It highlights the limitations of current review datasets for long-term feedback evaluation and demonstrates the importance of including simple baselines in RL-based IRS assessments.
Findings
Simple greedy reward models outperform RL-based models in cumulative reward maximization.
Higher weighting of long-term rewards degrades recommendation performance.
User feedback shows limited long-term effects in benchmark datasets.
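The findings above contrast greedy selection with objectives that weight future rewards. The sketch below is illustrative only and is not the authors' implementation. It assumes that a greedy reward model recommends the not-yet-shown item with the highest predicted immediate reward, and that "weighting of long-term rewards" corresponds to a discount factor gamma in a discounted cumulative reward. The function names and toy numbers are hypothetical.

```python
def discounted_return(rewards, gamma):
    """Discounted cumulative reward: sum over t of gamma**t * r_t.

    gamma = 0 counts only the first reward; gamma = 1 weights all steps equally.
    """
    return sum((gamma ** t) * r for t, r in enumerate(rewards))


def greedy_recommend(predicted_rewards, excluded=()):
    """Return the index of the non-excluded item with the highest predicted reward."""
    candidates = [i for i in range(len(predicted_rewards)) if i not in excluded]
    return max(candidates, key=lambda i: predicted_rewards[i])


def greedy_episode(predicted_rewards, true_rewards, steps):
    """Recommend greedily for `steps` rounds without repeating an item.

    Assumption: the predicted rewards stay fixed during the episode, i.e. past
    recommendations do not change future feedback.
    """
    shown, collected = set(), []
    for _ in range(steps):
        item = greedy_recommend(predicted_rewards, shown)
        shown.add(item)
        collected.append(true_rewards[item])
    return collected


# Hypothetical toy data: predicted and observed ratings for three items.
predicted = [0.2, 0.9, 0.5]
observed = [1.0, 5.0, 3.0]
episode_rewards = greedy_episode(predicted, observed, steps=2)
total = discounted_return(episode_rewards, gamma=1.0)
```

Under the fixed-feedback assumption in this sketch, picking the best immediate item at every step also maximizes the cumulative reward, which is one way to read the finding that benchmark feedback shows limited long-term effects.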
Abstract
Interactive Recommender Systems (IRSs) have attracted a lot of attention, due to their ability to model interactive processes between users and recommender systems. Numerous approaches have adopted Reinforcement Learning (RL) algorithms, as these can directly maximize users' cumulative rewards. In IRS, researchers commonly utilize publicly available review datasets to compare and evaluate algorithms. However, user feedback provided in public datasets merely includes instant responses (e.g., a rating), with no inclusion of delayed responses (e.g., the dwell time and the lifetime value). Thus, the question remains whether these review datasets are an appropriate choice to evaluate the long-term effects of the IRS. In this work, we revisited experiments on IRS with review datasets and compared RL-based models with a simple reward model that greedily recommends the item with the highest…
