Retentive Decision Transformer with Adaptive Masking for Reinforcement Learning based Recommendation Systems
Siyu Wang, Xiaocong Chen, Lina Yao

TL;DR
This paper introduces a novel offline reinforcement learning-based recommender system that employs adaptive masking and a multi-scale retention mechanism to improve sequence modeling and computational efficiency.
Contribution
It proposes a new method that models sequential decision-making as an inference task with adaptive masking and multi-scale retention, addressing challenges of sequence length and resource use.
Findings
Outperforms existing methods on online and offline datasets.
Enhances sequence modeling for evolving user preferences.
Reduces computational costs for long sequence processing.
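The efficiency claim can be illustrated with a minimal sketch of a generic retention layer, as described in the Retentive Network literature. This summary does not give the paper's own formulation, so everything here is an assumption for illustration: the function names `retention_direct` and `retention_recurrent`, the single decay rate `gamma`, and the omission of projections, normalization, and gating. A multi-scale variant would typically run several such heads, each with a different `gamma`. The sketch shows that an all-pairs computation and a step-by-step recurrence with a fixed-size state give the same outputs.

```python
def retention_direct(q, k, v, gamma):
    """All-pairs form: out[i] = sum over j <= i of gamma**(i-j) * (q[i] . k[j]) * v[j]."""
    d_v = len(v[0])
    out = []
    for i in range(len(q)):
        row = [0.0] * d_v
        for j in range(i + 1):  # causal: position i only sees positions j <= i
            weight = gamma ** (i - j) * sum(a * b for a, b in zip(q[i], k[j]))
            for c in range(d_v):
                row[c] += weight * v[j][c]
        out.append(row)
    return out


def retention_recurrent(q, k, v, gamma):
    """Recurrent form: state = gamma * state + outer(k[t], v[t]); out[t] = q[t] @ state."""
    d_k, d_v = len(k[0]), len(v[0])
    state = [[0.0] * d_v for _ in range(d_k)]  # fixed size, independent of sequence length
    out = []
    for t in range(len(q)):
        for r in range(d_k):
            for c in range(d_v):
                state[r][c] = gamma * state[r][c] + k[t][r] * v[t][c]
        out.append([sum(q[t][r] * state[r][c] for r in range(d_k)) for c in range(d_v)])
    return out
```

In the recurrent form, each new step costs the same regardless of how long the history is, because the past is summarized in a `d_k` by `d_v` state. This is the property usually cited when retention is proposed as a cheaper alternative to attention for long sequences.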
Abstract
Reinforcement Learning-based Recommender Systems (RLRS) have shown promise across a spectrum of applications, from e-commerce platforms to streaming services. Yet, they grapple with challenges, notably in crafting reward functions and harnessing large pre-existing datasets within the RL framework. Recent advancements in offline RLRS provide a solution for how to address these two challenges. However, existing methods mainly rely on the transformer architecture, which, as sequence lengths increase, can introduce challenges associated with computational resources and training costs. Additionally, the prevalent methods employ fixed-length input trajectories, restricting their capacity to capture evolving user preferences. In this study, we introduce a new offline RLRS method to deal with the above problems. We reinterpret the RLRS challenge by modeling sequential decision-making as an…
Taxonomy
Topics: Opinion Dynamics and Social Influence · Data Stream Mining Techniques · Recommender Systems and Techniques
Methods: L1 Regularization · Adaptive Masking
