Intrinsically Motivated Reinforcement Learning based Recommendation with Counterfactual Data Augmentation
Xiaocong Chen, Siyu Wang, Lina Yao, Lianyong Qi, Yong Li

TL;DR
This paper introduces an intrinsically motivated reinforcement learning approach with counterfactual data augmentation to improve exploration and exploitation in recommender systems with sparse environments, and reports better performance than existing state-of-the-art methods in experiments.
Contribution
It proposes a new intrinsically motivated RL method, combined with counterfactual data augmentation, to enhance exploration and exploitation in recommender systems with sparse environments.
Findings
Outperforms existing state-of-the-art methods on six offline datasets.
Effective in balancing exploration and exploitation in sparse environments.
Shows improved recommendation accuracy in online simulations.
Abstract
Deep reinforcement learning (DRL) has proven efficient at capturing users' dynamic interests in recent literature. However, training a DRL agent is challenging: because the environment in recommender systems (RS) is sparse, DRL agents may spend their time either exploring informative user-item interaction trajectories or using existing trajectories for policy learning. This is known as the exploration-exploitation trade-off, and it affects recommendation performance significantly when the environment is sparse. Balancing exploration and exploitation is even more challenging in DRL-based RS, where the agent needs to deeply explore informative trajectories and exploit them efficiently. As a step toward addressing this issue, we design a novel intrinsically motivated reinforcement learning method to increase the capability of exploring…
Taxonomy
Topics: Recommender Systems and Techniques · Advanced Bandit Algorithms Research
