DiverseDialogue: A Methodology for Designing Chatbots with Human-Like Diversity
Xiaoyu Lin, Xinkai Yu, Ankit Aich, Salvatore Giorgi, Lyle Ungar

TL;DR
This paper introduces a methodology for making LLM-simulated user conversations more human-like and more diverse, narrowing the gap with real human interactions in order to improve chatbot evaluation.
Contribution
The paper proposes an automatic prompt generation approach that incorporates features of real human interactions to significantly improve linguistic diversity in chatbot simulations.
Findings
54% reduction in feature error between human and LLM conversations
Enhanced linguistic diversity in chatbot interactions
Improved evaluation of user-facing chatbots
Abstract
Large Language Models (LLMs) that simulate human users are frequently employed to evaluate chatbots in applications such as tutoring and customer service. Effective evaluation requires these simulations to show a high degree of human-like diversity. In this paper, we demonstrate that conversations generated by GPT-4o mini, when used as a simulated human participant, differ systematically from conversations between actual humans across multiple linguistic features. The differences appear in topic variation, in lexical attributes, and in both the average behavior and the diversity (variance) of the language used. To address these discrepancies, we propose an approach that automatically generates prompts for user simulations by incorporating features derived from real human interactions, such as age, gender, emotional tone, and the topics discussed. We assess our approach using differential language…
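The prompt generation idea described in the abstract can be illustrated with a minimal sketch. It assumes that each real conversation has already been reduced to a small profile of the features the abstract names (age, gender, emotional tone, topic). The `UserProfile` class, the `build_user_prompt` and `sample_prompts` functions, and the prompt wording are all hypothetical and are not taken from the paper.

```python
import random
from dataclasses import dataclass


@dataclass
class UserProfile:
    """Hypothetical container for features derived from one real conversation."""
    age: int
    gender: str
    emotional_tone: str
    topic: str


def build_user_prompt(profile: UserProfile) -> str:
    """Turn a feature profile into a prompt for an LLM that plays the user.

    The wording is illustrative only; the paper's actual prompts may differ.
    """
    return (
        f"You are a {profile.age}-year-old {profile.gender} user "
        f"chatting with an assistant. "
        f"Your emotional tone is {profile.emotional_tone}. "
        f"You want to talk about {profile.topic}. "
        "Write only your next message, in your own voice."
    )


def sample_prompts(profiles: list[UserProfile], n: int, seed: int = 0) -> list[str]:
    """Draw n profiles with replacement and build one prompt per draw.

    Sampling from profiles of real conversations (an assumption of this
    sketch) lets the simulated users inherit the spread of the human data
    instead of collapsing onto a single default persona.
    """
    rng = random.Random(seed)
    return [build_user_prompt(rng.choice(profiles)) for _ in range(n)]


if __name__ == "__main__":
    # Made-up example profiles, not data from the paper.
    profiles = [
        UserProfile(34, "female", "frustrated", "a delayed refund"),
        UserProfile(19, "male", "curious", "help with algebra homework"),
    ]
    for prompt in sample_prompts(profiles, n=3):
        print(prompt)
```

Each generated prompt would then be passed to the simulating model as its instructions. Fixing the seed makes a set of simulated users reproducible across evaluation runs.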
Taxonomy
Topics: AI in Service Interactions
Methods: travel james
