Evaluating Conversational Recommender Systems: A Landscape of Research
Dietmar Jannach

TL;DR
This paper reviews evaluation methods for conversational recommender systems, highlighting challenges in assessing complex multi-component systems and proposing future directions for more comprehensive evaluation approaches.
Contribution
It provides a comprehensive overview of current evaluation techniques, discusses their limitations, and suggests future research directions for holistic assessment of conversational recommenders.
Findings
Existing evaluation methods are often limited to either objective or subjective measures.
Holistic evaluation of conversational recommenders remains a significant challenge.
Future research should focus on integrated evaluation frameworks combining multiple assessment techniques.
Abstract
Conversational recommender systems aim to interactively support online users in their information search and decision-making processes in an intuitive way. With the latest advances in voice-controlled devices, natural language processing, and AI in general, such systems have received increased attention in recent years. Technically, conversational recommenders are usually complex multi-component applications and often consist of multiple machine learning models and a natural language user interface. Evaluating such a complex system in a holistic way can therefore be challenging, as it requires (i) the assessment of the quality of the different learning components, and (ii) the quality perception of the system as a whole by users. Thus, a mixed methods approach is often required, which may combine objective (computational) and subjective (perception-oriented) evaluation techniques. In this…
