Robust Reinforcement Learning Objectives for Sequential Recommender Systems
Melissa Mozifian, Tristan Sylvain, Dave Evans, Lili Meng

TL;DR
This paper proposes robust reinforcement learning objectives for sequential recommender systems that address data imbalance and training instability, improving performance and robustness on real-world datasets.
Contribution
We introduce a novel contrastive-based reinforcement learning approach with augmentation and stability enhancements for sequential recommendation tasks.
Findings
Achieves state-of-the-art performance on multiple datasets.
Demonstrates increased robustness against data imbalance.
Addresses instability issues in offline RL settings.
Abstract
Attention-based sequential recommendation methods have shown promise in accurately capturing users' evolving interests from their past interactions. Beyond generating better user representations, recent research has also explored integrating reinforcement learning (RL) into these models. By framing sequential recommendation as an RL problem with reward signals, we can develop recommender systems that incorporate direct user feedback in the form of rewards, enhancing personalization. Nonetheless, employing RL algorithms presents challenges, including off-policy training, expansive combinatorial action spaces, and the scarcity of datasets with sufficient reward signals. Contemporary approaches have attempted to combine RL and sequential modeling, incorporating contrastive-based objectives and negative sampling strategies for training the RL component. In this…
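To make the RL framing in the abstract concrete, here is a minimal sketch of how a logged user session might be turned into offline (state, action, reward, next state) transitions. It is an illustration under stated assumptions, not the authors' pipeline: the event names, the reward values in `REWARD_BY_EVENT`, the `Transition` type, the `session_to_transitions` helper, and the fixed history window `max_len` are all hypothetical and do not come from the paper.

```python
from typing import List, NamedTuple, Tuple

# Hypothetical reward per feedback type; the source does not specify values.
REWARD_BY_EVENT = {"click": 0.2, "purchase": 1.0}


class Transition(NamedTuple):
    state: Tuple[int, ...]       # recent item ids, most recent last
    action: int                  # item id the user interacted with next
    reward: float                # reward derived from the feedback type
    next_state: Tuple[int, ...]  # history after appending the action
    done: bool                   # True on the last step of the session


def session_to_transitions(
    session: List[Tuple[int, str]], max_len: int = 3
) -> List[Transition]:
    """Convert one logged session into offline RL transitions.

    The state is assumed to be the last `max_len` item ids; a real model
    would typically encode this history with a sequence network instead.
    """
    transitions = []
    history: Tuple[int, ...] = ()
    for step, (item, event) in enumerate(session):
        # Append the chosen item and keep only the most recent max_len ids.
        next_history = (history + (item,))[-max_len:]
        transitions.append(
            Transition(
                state=history,
                action=item,
                reward=REWARD_BY_EVENT[event],
                next_state=next_history,
                done=step == len(session) - 1,
            )
        )
        history = next_history
    return transitions


# Toy session of (item id, event) pairs, invented for illustration.
session = [(10, "click"), (42, "click"), (7, "purchase"), (42, "click")]
transitions = session_to_transitions(session)
```

Because every transition comes from logged behaviour rather than from the policy being trained, learning from such data is off-policy, and the action space is the whole item catalogue; these are two of the challenges the abstract lists.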
Taxonomy
Topics: Advanced Bandit Algorithms Research · Smart Grid Energy Management · Mind wandering and attention
