Locality-Sensitive Experience Replay for Online Recommendation
Xiaocong Chen, Lina Yao, Xianzhi Wang, Julian McAuley

TL;DR
This paper introduces a state-aware, locality-sensitive experience replay method for deep reinforcement learning in online recommender systems, aiming to improve training efficiency and recommendation quality in complex, dynamic environments.
Contribution
It proposes a novel experience replay model combining locality-sensitive hashing and prioritized sampling to enhance learning in online recommender systems.
Findings
Outperforms existing experience replay methods in simulations
Effectively captures dynamic user preferences
Reduces training time and improves recommendation accuracy
Abstract
Online recommendation requires handling rapidly changing user preferences. Deep reinforcement learning (DRL) is gaining interest as an effective means of capturing users' dynamic interests during interactions with recommender systems. However, it is challenging to train a DRL agent because of the large state space (e.g., user-item rating matrix and user profiles), the large action space (e.g., candidate items), and sparse rewards. Existing studies encourage the agent to learn from past experience via experience replay (ER). However, existing ER methods adapt poorly to the complex environment of online recommender systems and are inefficient in determining an optimal strategy from past experience. To address these issues, we design a novel state-aware experience replay model, which uses locality-sensitive hashing to map high-dimensional data into low-dimensional representations and a prioritized reward-driven strategy to replay…
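The two ingredients named in the abstract, hashing states with locality-sensitive hashing and replaying stored experience with a reward-driven priority, can be illustrated with a small sketch. This is not the authors' algorithm: the class name `LSHReplayBuffer`, the choice of sign-random-projection hashing, the bucket lookup, and the reward-shift weighting are all assumptions made for illustration, since the abstract above is truncated and does not give these details.

```python
import random


class LSHReplayBuffer:
    """Hypothetical sketch of an LSH-bucketed, reward-weighted replay buffer.

    Assumptions (not from the paper): sign-random-projection hashing,
    exact-bucket lookup, and weights proportional to shifted reward.
    """

    def __init__(self, state_dim, n_bits=8, seed=0):
        self.rng = random.Random(seed)
        # One random hyperplane per hash bit.
        self.planes = [
            [self.rng.gauss(0.0, 1.0) for _ in range(state_dim)]
            for _ in range(n_bits)
        ]
        # Hash code -> list of (state, action, reward, next_state).
        self.buckets = {}

    def hash_state(self, state):
        # Each bit records which side of a hyperplane the state lies on,
        # so states pointing in similar directions tend to share a code.
        code = 0
        for plane in self.planes:
            dot = sum(w * x for w, x in zip(plane, state))
            code = (code << 1) | (1 if dot >= 0 else 0)
        return code

    def add(self, state, action, reward, next_state):
        key = self.hash_state(state)
        self.buckets.setdefault(key, []).append(
            (state, action, reward, next_state)
        )

    def sample(self, state, batch_size, eps=1e-3):
        # Retrieve transitions whose states share the query's hash code.
        candidates = self.buckets.get(self.hash_state(state), [])
        if not candidates:
            return []
        # Shift rewards so every weight is positive; higher reward
        # means a higher chance of being replayed.
        low = min(t[2] for t in candidates)
        weights = [t[2] - low + eps for t in candidates]
        # Sampling is with replacement.
        return self.rng.choices(candidates, weights=weights, k=batch_size)
```

In this sketch, `sample` is called with the agent's current state, so the replayed batch is drawn only from past transitions that hashed to the same bucket. A fuller design would likely probe neighboring buckets or use several hash tables to reduce misses, but that goes beyond what the abstract states.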
Taxonomy
Topics: Advanced Bandit Algorithms Research · Recommender Systems and Techniques · Reinforcement Learning in Robotics
Methods: Experience Replay
