EventChat: Implementation and user-centric evaluation of a large language model-driven conversational recommender system for exploring leisure events in an SME context
Hannes Kunstmann, Joseph Ollier, Joel Persson, Florian von Wangenheim

TL;DR
This paper presents the design, evaluation, and strategic considerations of an LLM-driven conversational recommender system for SMEs, highlighting performance, cost, and latency challenges with practical insights.
Contribution
It introduces a user-centric evaluation framework and a revised ResQue model for assessing LLM-driven CRS in SME contexts, emphasizing real-world performance and strategic implications.
Findings
85.5% recommendation accuracy achieved
Median cost per interaction is $0.04
Latency is approximately 5.7 seconds
Abstract
Large language models (LLMs) present an enormous evolution in the strategic potential of conversational recommender systems (CRS). Yet to date, research has predominantly focused on technical frameworks to implement LLM-driven CRS, rather than end-user evaluations or strategic implications for firms, particularly from the perspective of the small to medium enterprises (SMEs) that make up the bedrock of the global economy. In the current paper, we detail the design of an LLM-driven CRS in an SME setting, and its subsequent performance in the field using both objective system metrics and subjective user evaluations. While doing so, we additionally outline a short-form revised ResQue model for evaluating LLM-driven CRS, enabling replicability in a rapidly evolving field. Our results reveal good system performance from a user experience perspective (85.5% recommendation accuracy) but…
