Model-agnostic Counterfactual Synthesis Policy for Interactive Recommendation
Siyu Wang, Xiaocong Chen, Lina Yao

TL;DR
This paper introduces a model-agnostic counterfactual synthesis policy that generates synthetic data to mitigate data sparsity in interactive recommendation systems, enhancing reinforcement learning performance.
Contribution
It proposes a novel counterfactual synthesis policy applicable to any RL-based recommendation system, addressing data sparsity by modeling from observation and counterfactual distributions.
Findings
Effective in generating synthetic data to improve RL training
Demonstrates generality across different RL algorithms
Shows improved recommendation performance in experiments
Abstract
Interactive recommendation learns from the interactions between users and the system in order to track users' dynamic interests. Recent work has shown that reinforcement learning, which is well suited to dynamic processes, can be applied effectively to interactive recommendation. However, the sparsity of interaction data may hamper the performance of the system. We propose to train a Model-agnostic Counterfactual Synthesis Policy that generates counterfactual data and addresses the data sparsity problem by modelling from both the observational and the counterfactual distributions. While other agents are being trained, the proposed policy can identify and replace the trivial components of any state, so it can be deployed with any RL-based algorithm. The experimental results demonstrate the effectiveness and generality of the proposed policy.
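The idea of replacing the trivial components of a state can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how trivial components are identified or what they are replaced with. The boolean `trivial_mask`, the `donor_state`, the function names, and the choice to keep the action and reward unchanged in the synthetic transition are all assumptions made purely for illustration.

```python
from typing import List, Sequence, Tuple

# Hypothetical transition layout: (state, action, reward).
Transition = Tuple[List[float], int, float]


def synthesize_counterfactual(
    state: Sequence[float],
    donor_state: Sequence[float],
    trivial_mask: Sequence[bool],
) -> List[float]:
    """Copy `state`, taking the components flagged as trivial from `donor_state`.

    Assumption: `trivial_mask[i]` is True where component i is considered
    trivial. In the paper this decision is made by a learned policy; here the
    mask is simply passed in.
    """
    if not (len(state) == len(donor_state) == len(trivial_mask)):
        raise ValueError("state, donor_state and trivial_mask must have equal length")
    return [d if m else s for s, d, m in zip(state, donor_state, trivial_mask)]


def augment(
    transitions: Sequence[Transition],
    donor_state: Sequence[float],
    trivial_mask: Sequence[bool],
) -> List[Transition]:
    """Return the observed transitions followed by one synthetic copy of each.

    Assumption: changing only trivial components leaves the action and reward
    unchanged, so they are reused as-is for the synthetic transition.
    """
    synthetic = [
        (synthesize_counterfactual(s, donor_state, trivial_mask), a, r)
        for s, a, r in transitions
    ]
    return list(transitions) + synthetic


observed = [([1.0, 2.0, 3.0], 7, 1.0)]
augmented = augment(observed, donor_state=[9.0, 9.0, 9.0],
                    trivial_mask=[False, True, False])
print(augmented)
```

Because the sketch only transforms stored transitions, the augmented list could be fed to the replay buffer of any RL agent, which is the sense in which such a scheme would be model-agnostic.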
Taxonomy
Topics: Recommender Systems and Techniques · Advanced Bandit Algorithms Research · Energy Load and Power Forecasting
