SimLab: A Platform for Simulation-based Evaluation of Conversational Information Access Systems
Nolwenn Bernard, Sharath Chandra Etagi Suresh, Krisztian Balog, ChengXiang Zhai

TL;DR
SimLab is a cloud-based platform for reproducible benchmarking of conversational information access (CIA) systems and user simulators, addressing a key barrier to measuring progress in CIA.
Contribution
This paper introduces SimLab, the first centralized infrastructure for simulation-based evaluation of CIA systems and user simulators, fostering reproducibility and community collaboration.
Findings
Implemented an initial version of SimLab
Demonstrated the platform's capabilities through a simulation-based evaluation task in conversational movie recommendation
Enabled controlled, reproducible experiments for evaluating CIA systems and user simulators
Abstract
Progress in conversational information access (CIA) systems has been hindered by the difficulty of evaluating such systems with reproducible experiments. While user simulation offers a promising solution, the lack of infrastructure and tooling to support this evaluation paradigm remains a significant barrier. To address this gap, we introduce SimLab, the first cloud-based platform providing a centralized solution for the community to benchmark both conversational systems and user simulators in a controlled and reproducible setting. We articulate the requirements for such a platform and propose a general infrastructure to meet them. We then present the design and implementation of an initial version of SimLab and showcase its features through an initial simulation-based evaluation task in conversational movie recommendation. Furthermore, we discuss the platform's sustainability and…
Taxonomy
Topics: Multimedia Communication and Technology · Speech and dialogue systems · Service-Oriented Architecture and Web Services
