Model-agnostic Counterfactual Synthesis Policy for Interactive Recommendation

Siyu Wang; Xiaocong Chen; Lina Yao

arXiv:2204.00308 · cs.IR · April 4, 2022 · 1 citation

Open Access

TL;DR

This paper introduces a model-agnostic counterfactual synthesis policy that generates synthetic data to mitigate data sparsity in interactive recommendation systems, enhancing reinforcement learning performance.

Contribution

It proposes a novel counterfactual synthesis policy applicable to any RL-based recommendation system, addressing data sparsity by modeling from observation and counterfactual distributions.

Findings

01. Effective in generating synthetic data to improve RL training

02. Demonstrates generality across different RL algorithms

03. Shows improved recommendation performance in experiments

Abstract

Interactive recommendation learns from the interactions between users and the system in order to track users' dynamic interests. Recent work has shown that the ability of reinforcement learning (RL) to handle dynamic processes can be applied effectively to interactive recommendation. However, the sparsity of interaction data may hamper the performance of the system. We propose to train a Model-agnostic Counterfactual Synthesis Policy that generates counterfactual data and addresses the data sparsity problem by modelling from the observation and counterfactual distributions. During training with other agents, the proposed policy can identify and replace the trivial components of any state, and it can be deployed in any RL-based algorithm. The experimental results demonstrate the effectiveness and generality of the proposed policy.
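The idea of replacing the trivial components of a state can be illustrated with a minimal sketch. The abstract does not say how trivial components are identified or where their replacements come from, so everything below is an assumption made for illustration: the function name `synthesize_counterfactual`, the per-component `importance` scores, the `donor_state` that supplies replacement values, and the `trivial_fraction` parameter are all hypothetical, and the fixed lowest-importance rule stands in for the learned policy described in the paper.

```python
import numpy as np


def synthesize_counterfactual(state, donor_state, importance, trivial_fraction=0.25):
    """Return a copy of `state` whose least important components are
    overwritten with the matching components of `donor_state`.

    Hypothetical illustration only: in the paper a learned policy decides
    what to replace; here a fixed importance ranking is assumed instead.
    All three inputs are expected to be 1-D vectors of equal length.
    """
    state = np.asarray(state, dtype=float)
    donor_state = np.asarray(donor_state, dtype=float)
    importance = np.asarray(importance, dtype=float)
    if not (state.shape == donor_state.shape == importance.shape):
        raise ValueError("state, donor_state and importance must share one shape")

    # Number of components treated as "trivial" (assumed hyperparameter).
    k = int(round(trivial_fraction * state.size))

    # Indices of the k lowest-importance components.
    trivial_idx = np.argsort(importance, kind="stable")[:k]

    counterfactual = state.copy()
    counterfactual[trivial_idx] = donor_state[trivial_idx]
    return counterfactual, trivial_idx


# Toy usage: a 4-dimensional state, with half of it treated as trivial.
observed = [0.9, 0.1, 0.5, 0.3]
donor = [-1.0, -2.0, -3.0, -4.0]
scores = [0.8, 0.05, 0.6, 0.1]
cf_state, replaced = synthesize_counterfactual(observed, donor, scores, trivial_fraction=0.5)
print(cf_state, replaced)
```

In a setup like this, the synthesized states could be added to the training data of an RL recommender alongside the observed ones, which is one plausible reading of how counterfactual data would ease sparsity; the paper itself should be consulted for the actual procedure.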

Taxonomy

Topics: Recommender Systems and Techniques · Advanced Bandit Algorithms Research · Energy Load and Power Forecasting