Simulating User Diversity in Task-Oriented Dialogue Systems using Large Language Models
Adnan Ahmad, Stefan Hillmann, Sebastian Möller

TL;DR
This paper demonstrates how large language models can generate diverse synthetic users for task-oriented dialogue systems, enabling detailed analysis of user behavior and system performance.
Contribution
It introduces a novel LLM-based user simulation approach that creates diverse user profiles and evaluates dialogue success, advancing research in dialogue system testing.
Findings
GPT-o1 produces more diverse user profiles than GPT-4o.
GPT-4o generates more skewed user attributes than GPT-o1.
Synthetic user profiles enable detailed dialogue analysis.
Abstract
In this study, we explore the application of Large Language Models (LLMs) for generating synthetic users and simulating user conversations with a task-oriented dialogue system, and we present detailed results and their analysis. We propose a comprehensive, novel user simulation technique that uses LLMs to create diverse user profiles, set goals, engage in multi-turn dialogues, and evaluate conversation success. We employ two proprietary LLMs, namely GPT-4o and GPT-o1 (Achiam et al., 2023), to generate a heterogeneous base of user profiles, characterized by varied demographics, multiple user goals, different conversational styles, initial knowledge levels, interests, and conversational objectives. We perform a detailed analysis of the user profiles generated by LLMs to assess the diversity, consistency, and potential biases inherent in these LLM-generated user simulations. We…
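The abstract mentions assessing the diversity and potential biases of the generated user profiles. As a rough illustration only, the sketch below scores how evenly one categorical attribute is spread across a set of profiles using normalized Shannon entropy. This is an assumed metric chosen for the example, not necessarily the one the authors use. The `UserProfile` fields and the sample values are hypothetical.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass
class UserProfile:
    # Hypothetical fields; the paper's actual profile schema may differ.
    age_group: str
    conversational_style: str
    knowledge_level: str


def normalized_entropy(values):
    """Return Shannon entropy of `values` scaled to [0, 1].

    1.0 means the observed categories are evenly used; values near 0
    mean one category dominates (a skewed attribute). Normalization is
    over the categories that actually occur, which is an assumption.
    """
    counts = Counter(values)
    total = sum(counts.values())
    if len(counts) <= 1:
        return 0.0  # empty input or a single category: no diversity
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return entropy / math.log(len(counts))


# Made-up profiles, purely to show how the score is read.
profiles = [
    UserProfile("18-29", "formal", "novice"),
    UserProfile("18-29", "casual", "novice"),
    UserProfile("18-29", "terse", "expert"),
    UserProfile("30-44", "formal", "novice"),
]

age_score = normalized_entropy(p.age_group for p in profiles)
style_score = normalized_entropy(p.conversational_style for p in profiles)
print(f"age_group diversity: {age_score:.2f}")
print(f"conversational_style diversity: {style_score:.2f}")
```

Under this scoring, an attribute whose score sits well below 1.0 for one model and close to 1.0 for another would be one sign that the first model generates more skewed profiles.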
Taxonomy
Topics: Speech and dialogue systems · Multi-Agent Systems and Negotiation · Topic Modeling
