TL;DR
This paper introduces USi, a user simulator for automatically evaluating conversational search systems by generating user responses to clarifying questions, reducing reliance on human evaluation and enabling multi-turn interaction studies.
Contribution
The paper presents USi, a GPT-2-based user simulator for automatic evaluation of conversational search systems, and extends existing datasets with crowdsourced data to support multi-turn interactions.
Findings
USi's responses are in line with the underlying information need and comparable to human-generated answers.
USi performs accurately in single-turn clarifying question scenarios.
The approach facilitates multi-turn interaction research in conversational search.
Abstract
Clarifying the underlying user information need by asking clarifying questions is an important feature of modern conversational search systems. However, evaluation of such systems through answering prompted clarifying questions requires significant human effort, which can be time-consuming and expensive. In this paper, we propose a conversational User Simulator, called USi, for automatic evaluation of such conversational search systems. Given a description of an information need, USi is capable of automatically answering clarifying questions about the topic throughout the search session. Through a set of experiments, including automated natural language generation metrics and crowdsourcing studies, we show that responses generated by USi are both in line with the underlying information need and comparable to human-generated answers. Moreover, we make the first steps towards multi-turn…
