Expectation Confirmation Preference Optimization for Multi-Turn Conversational Recommendation Agent
Xueyang Feng, Jingsen Zhang, Jiakai Tang, Wei Li, Guohao Cai, Xu Chen, Quanyu Dai, Yue Zhu, Zhenhua Dong

TL;DR
This paper introduces ECPO, a multi-turn preference optimization method for conversational recommendation agents that models how user satisfaction evolves across turns and improves response quality with less sampling overhead.
Contribution
The paper proposes ECPO, a new multi-turn preference optimization paradigm based on Expectation Confirmation Theory that reduces sampling overhead and improves multi-turn dialogue performance.
Findings
ECPO outperforms existing MTPO methods in efficiency and effectiveness.
The LLM-based user simulator AILO effectively supports expectation confirmation.
ECPO significantly improves user satisfaction in multi-turn conversations.
Abstract
Recent advancements in Large Language Models (LLMs) have significantly propelled the development of Conversational Recommendation Agents (CRAs). However, these agents often generate short-sighted responses that fail to sustain user guidance and meet expectations. Although preference optimization has proven effective in aligning LLMs with user expectations, it remains costly and performs poorly in multi-turn dialogue. To address this challenge, we introduce a novel multi-turn preference optimization (MTPO) paradigm, ECPO, which leverages Expectation Confirmation Theory to explicitly model the evolution of user satisfaction throughout multi-turn dialogues, uncovering the underlying causes of dissatisfaction. These causes can be used to support targeted optimization of unsatisfactory responses, thereby achieving turn-level preference optimization. ECPO ingeniously eliminates the…
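The turn-level idea described in the abstract can be illustrated with a small sketch. This is not the paper's implementation: the names (Turn, track_satisfaction, build_preference_pairs), the additive satisfaction update, the numeric scores, and the placeholder rewrite function are all assumptions made for illustration. Confirmation is taken here as perceived performance minus the user's expectation; turns with negative confirmation are flagged as unsatisfactory, and a rewritten response for each flagged turn forms a (chosen, rejected) preference pair.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    response: str
    expectation: float  # hypothetical score: what the user hoped for at this turn
    performance: float  # hypothetical score: what the response actually delivered


def track_satisfaction(turns, initial=0.0):
    """Return the satisfaction trajectory and the indices of unsatisfactory turns.

    Assumption: satisfaction changes additively by the confirmation term
    (performance - expectation); the paper may model this differently.
    """
    satisfaction = initial
    trajectory, flagged = [], []
    for i, turn in enumerate(turns):
        confirmation = turn.performance - turn.expectation
        satisfaction += confirmation
        trajectory.append(satisfaction)
        if confirmation < 0:  # expectation not met at this turn
            flagged.append(i)
    return trajectory, flagged


def build_preference_pairs(turns, flagged, rewrite):
    """Pair each flagged response (rejected) with a rewritten one (chosen)."""
    return [
        {"turn": i, "rejected": turns[i].response, "chosen": rewrite(turns[i])}
        for i in flagged
    ]


# Toy dialogue with made-up scores.
dialogue = [
    Turn("What genres do you enjoy?", expectation=3.0, performance=4.0),
    Turn("Try this random movie.", expectation=4.0, performance=2.0),
    Turn("Here is a thriller like the one you named.", expectation=3.0, performance=3.0),
]

trajectory, flagged = track_satisfaction(dialogue)
# Placeholder rewrite; in practice this would be a model call guided by the
# identified cause of dissatisfaction.
pairs = build_preference_pairs(dialogue, flagged, lambda t: "[rewritten] " + t.response)
```

In this toy dialogue only the second turn falls short of the expectation, so it is the only turn that yields a preference pair; the other turns need no rewriting.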
Taxonomy
Topics: Recommender Systems and Techniques · Advanced Text Analysis Techniques · Sentiment Analysis and Opinion Mining
