Identifying Breakdowns in Conversational Recommender Systems using User Simulation

Nolwenn Bernard; Krisztian Balog

arXiv:2405.14249·cs.IR·May 24, 2024

TL;DR

This paper introduces a systematic method that uses user simulation to identify and analyze conversational breakdowns in conversational recommender systems, with the aim of making such systems more robust at low cost.

Contribution

The paper presents a diagnostic methodology that uses user simulation to detect and characterize conversational breakdowns in conversational recommender systems.

Findings

01 Effective identification of breakdown types

02 Improved system robustness after a few iterations

03 Cost-effective and efficient testing process

Abstract

We present a methodology to systematically test conversational recommender systems with regard to conversational breakdowns. It involves examining conversations generated between the system and simulated users for a set of pre-defined breakdown types, extracting responsible conversational paths, and characterizing them in terms of the underlying dialogue intents. User simulation offers the advantages of simplicity, cost-effectiveness, and time efficiency for obtaining conversations where potential breakdowns can be identified. The proposed methodology can be used as a diagnostic tool as well as a development tool to improve conversational recommender systems. We apply our methodology in a case study with an existing conversational recommender system and user simulator, demonstrating that with just a few iterations, we can make the system more robust to conversational breakdowns.
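
The three steps named in the abstract (detecting breakdowns, extracting the responsible conversational paths, and characterizing them by dialogue intent) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper or its repository: the `Turn` structure, the intent labels (`ELICIT`, `REVEAL`, `INQUIRE`, `RECOMMEND`, `FALLBACK`), the single breakdown type (an agent fallback response), and the fixed-size context window used to define a path.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Turn:
    speaker: str  # "USER" or "AGENT"
    intent: str   # dialogue intent label (hypothetical label set)
    text: str = ""


# Hypothetical breakdown type: the agent replies with a fallback intent.
FALLBACK_INTENT = "FALLBACK"


def find_breakdowns(dialogue):
    """Return the indices of agent turns flagged as breakdowns."""
    return [
        i for i, turn in enumerate(dialogue)
        if turn.speaker == "AGENT" and turn.intent == FALLBACK_INTENT
    ]


def extract_path(dialogue, index, context=2):
    """Return the intent sequence of up to `context` preceding turns plus the breakdown turn."""
    start = max(0, index - context)
    return tuple(f"{t.speaker}:{t.intent}" for t in dialogue[start:index + 1])


def summarize(dialogues, context=2):
    """Count how often each intent path ends in a breakdown across all dialogues."""
    counts = Counter()
    for dialogue in dialogues:
        for index in find_breakdowns(dialogue):
            counts[extract_path(dialogue, index, context)] += 1
    return counts


# Two toy simulated conversations (invented data, for illustration only).
d1 = [
    Turn("AGENT", "ELICIT"),
    Turn("USER", "REVEAL"),
    Turn("AGENT", "RECOMMEND"),
    Turn("USER", "INQUIRE"),
    Turn("AGENT", "FALLBACK"),
]
d2 = [
    Turn("USER", "INQUIRE"),
    Turn("AGENT", "FALLBACK"),
]

paths = summarize([d1, d2], context=1)
for path, count in paths.most_common():
    print(count, " -> ".join(path))
```

In this toy setup, paths that recur across many simulated conversations would be the first candidates to inspect and fix before regenerating conversations for the next iteration.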

Code & Models

Repositories

nob0/crs-breakdown-detection
none·Official

Taxonomy

Methods: Sparse Evolutionary Training