Deep Reinforcement Learning based Recommendation with Explicit User-Item Interactions Modeling
Feng Liu, Ruiming Tang, Xutao Li, Weinan Zhang, Yunming Ye, Haokun Chen, Huifeng Guo, Yuzhou Zhang

TL;DR
This paper introduces DRR, a deep reinforcement learning framework for recommendation systems that models dynamic user-item interactions and optimizes long-term rewards, outperforming existing methods.
Contribution
The paper proposes a novel DRR framework using Actor-Critic reinforcement learning to explicitly model user-item interactions and long-term rewards in recommendations.
Findings
DRR outperforms state-of-the-art methods in experiments.
Explicit interaction modeling improves recommendation quality.
Long-term reward optimization enhances user satisfaction.
Abstract
Recommendation is crucial in both academia and industry, and various techniques are proposed such as content-based collaborative filtering, matrix factorization, logistic regression, factorization machines, neural networks and multi-armed bandits. However, most of the previous studies suffer from two limitations: (1) considering the recommendation as a static procedure and ignoring the dynamic interactive nature between users and the recommender systems, (2) focusing on the immediate feedback of recommended items and neglecting the long-term rewards. To address the two limitations, in this paper we propose a novel recommendation framework based on deep reinforcement learning, called DRR. The DRR framework treats recommendation as a sequential decision making procedure and adopts an "Actor-Critic" reinforcement learning scheme to model the interactions between the users and recommender…
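The second limitation in the abstract, the gap between immediate feedback and long-term reward, can be illustrated with the standard discounted return from reinforcement learning. The sketch below is not taken from the paper: the function name `discounted_return`, the discount factor `gamma = 0.9`, and both reward sequences are made-up values chosen for illustration only.

```python
from typing import Sequence


def discounted_return(rewards: Sequence[float], gamma: float) -> float:
    """Return sum over t of gamma**t * rewards[t], the usual RL discounted return."""
    total = 0.0
    # Walk backwards so each step applies G_t = r_t + gamma * G_{t+1}.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Hypothetical per-step user feedback for two recommendation strategies.
# "greedy" earns its reward immediately; "patient" earns more, but later.
greedy_rewards = [1.0, 0.0, 0.0]
patient_rewards = [0.0, 1.0, 1.0]

gamma = 0.9  # assumed discount factor, not a value reported in the paper
greedy = discounted_return(greedy_rewards, gamma)
patient = discounted_return(patient_rewards, gamma)

print(f"greedy return:  {greedy:.2f}")
print(f"patient return: {patient:.2f}")
```

A recommender that scores only the first step would prefer the greedy sequence, while one that optimizes the discounted return prefers the patient sequence. That difference is the motivation the abstract gives for treating recommendation as a sequential decision problem.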
Taxonomy
Topics: Recommender Systems and Techniques · Advanced Bandit Algorithms Research · Smart Grid Energy Management
