RLT4Rec: Reinforcement Learning Transformer for User Cold Start and Item Recommendation
Dilina Chandika Rajapakse, Douglas Leith

TL;DR
RLT4Rec is a transformer-based reinforcement learning model for user cold start and item recommendation. It generates personalized item sequences from a user's (item, rating) history alone, with no separate state input, and its training is robust and fast.
Contribution
The paper introduces RLT4Rec, a simple transformer RL architecture that manages new and existing users within a unified framework, automatically balancing exploration and exploitation.
Findings
Achieves excellent performance across recommendation tasks.
Handles new and established users seamlessly.
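Training is robust and fast, and is insensitive to the choice of training data: the model learns to generate highly rated sequences even when trained on "bad" data.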
Abstract
We introduce a new sequential transformer reinforcement learning architecture, RLT4Rec, and demonstrate that it achieves excellent performance in a range of item recommendation tasks. RLT4Rec uses a relatively simple transformer architecture that takes as input the user's (item, rating) history and outputs the next item to present to the user. Unlike existing RL approaches, there is no need to input a state observation or estimate. RLT4Rec handles new users and established users within the same consistent framework and automatically balances the "exploration" needed to discover the preferences of a new user with the "exploitation" that is more appropriate for established users. Training of RLT4Rec is robust and fast and is insensitive to the choice of training data, learning to generate "good" personalised sequences that the user tends to rate highly even when trained on "bad" data.
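The interface the abstract describes (rating history in, next item out, no separate state input) can be illustrated with a minimal sketch. This is not the authors' implementation: the token encoding, the function names `encode_history`, `placeholder_policy`, and `rollout`, and the rating scale are all assumptions made for illustration, and the policy is a random stand-in where the trained transformer would go.

```python
import random
from typing import Callable, List, Tuple

History = List[Tuple[int, int]]  # (item_id, rating) pairs, oldest first


def encode_history(history: History, num_ratings: int) -> List[int]:
    """Map each (item, rating) pair to one integer token (hypothetical encoding).

    Ratings are assumed to lie in 1..num_ratings. Token 0 marks the sequence
    start, so a new user with no history is simply the sequence [0].
    """
    return [0] + [1 + item * num_ratings + (rating - 1) for item, rating in history]


def placeholder_policy(tokens: List[int], num_items: int, num_ratings: int,
                       rng: random.Random) -> int:
    """Stand-in for the trained transformer: returns a random not-yet-shown item."""
    # Recover the item ids already present in the token sequence.
    seen = {(t - 1) // num_ratings for t in tokens if t > 0}
    unseen = [i for i in range(num_items) if i not in seen]
    return rng.choice(unseen or list(range(num_items)))


def rollout(policy: Callable[[List[int], int, int, random.Random], int],
            rate_fn: Callable[[int], int],
            num_items: int, num_ratings: int, steps: int, seed: int = 0) -> History:
    """Recommend `steps` items in sequence; the only policy input is the history."""
    rng = random.Random(seed)
    history: History = []
    for _ in range(steps):
        tokens = encode_history(history, num_ratings)
        item = policy(tokens, num_items, num_ratings, rng)
        history.append((item, rate_fn(item)))  # the user's rating extends the history
    return history
```

For example, `rollout(placeholder_policy, lambda item: 1 + item % 5, num_items=10, num_ratings=5, steps=4)` simulates four recommendations for a new user. The same loop serves established users by starting from a non-empty history, which is the sense in which both cases share one framework.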
Taxonomy
Topics: Recommender Systems and Techniques · Data Stream Mining Techniques · Advanced Bandit Algorithms Research
