Unified Conversational Recommendation Policy Learning via Graph-based Reinforcement Learning
Yang Deng, Yaliang Li, Fei Sun, Bolin Ding, Wai Lam

TL;DR
This paper introduces a unified reinforcement learning approach using graph-based methods to improve conversational recommendation systems by integrating decision-making for asking attributes and recommending items, leading to better scalability and stability.
Contribution
It formulates CRS decision-making as a single policy learning task and develops a dynamic weighted graph RL method with action selection strategies for efficiency.
Findings
Significantly outperforms state-of-the-art methods on benchmark datasets.
Enhances scalability and stability of conversational recommender systems.
Proves effective in a real-world e-commerce application.
Abstract
Conversational recommender systems (CRS) enable traditional recommender systems to explicitly acquire user preferences towards items and attributes through interactive conversations. Reinforcement learning (RL) is widely adopted to learn conversational recommendation policies that decide what attributes to ask, which items to recommend, and when to ask or recommend, at each conversation turn. However, existing methods mainly target one or two of these three decision-making problems in CRS with separate conversation and recommendation components, which restricts the scalability and generality of CRS and falls short of preserving a stable training procedure. In light of these challenges, we propose to formulate these three decision-making problems in CRS as a unified policy learning task. In order to systematically integrate conversation and recommendation components, we…
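To make the idea of a unified policy concrete, the sketch below places "ask an attribute" and "recommend an item" actions in one action space, so that a single scoring step answers what to ask, which item to recommend, and when to do either. It is a minimal illustration under stated assumptions, not the authors' implementation: the names `Action`, `build_action_space`, and `select_action` are hypothetical, the scores are hand-set placeholders rather than outputs of a learned graph-based model, and epsilon-greedy selection is a stand-in exploration rule.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """One entry of the unified action space (hypothetical structure)."""
    kind: str    # "ask" or "recommend"
    target: str  # attribute name or item id


def build_action_space(candidate_attributes, candidate_items):
    """Merge ask-actions and recommend-actions into one list."""
    asks = [Action("ask", a) for a in candidate_attributes]
    recs = [Action("recommend", i) for i in candidate_items]
    return asks + recs


def select_action(actions, q_values, epsilon=0.0, rng=random):
    """Epsilon-greedy choice over the unified action space.

    q_values[k] is the (assumed) estimated value of actions[k].
    The kind of the chosen action decides whether this turn asks or recommends.
    """
    if not actions or len(actions) != len(q_values):
        raise ValueError("actions and q_values must be non-empty and aligned")
    if rng.random() < epsilon:
        return rng.choice(actions)  # explore
    best = max(range(len(actions)), key=lambda k: q_values[k])
    return actions[best]            # exploit


# Toy turn: two candidate attributes, two candidate items (placeholder scores).
actions = build_action_space(["color", "brand"], ["item_1", "item_2"])
chosen = select_action(actions, [0.1, 0.7, 0.3, 0.2])
print(chosen.kind, chosen.target)
```

Because both action types compete under the same value estimates, no separate component is needed to decide when to switch from asking to recommending; the choice falls out of which action scores highest at that turn.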
Taxonomy
Topics: Recommender Systems and Techniques · Advanced Bandit Algorithms Research · Topic Modeling
