MCPDial: A Minecraft Persona-driven Dialogue Dataset
Seyed Hossein Alavi, Sudha Rao, Ashutosh Adhikari, Gabriel A. DesGarennes, Akanksha Malhotra, Chris Brockett, Mahmoud Adada, Raymond T. Ng, Vered Shwartz, Bill Dolan

TL;DR
MCPDial is a dataset of long, persona-driven dialogues between players and NPCs in Minecraft, generated with LLMs from a small seed of expert-written conversations. Each conversation includes rich character descriptions and canonical function calls.
Contribution
We introduce MCPDial, a dataset generated with LLMs that features long, persona-driven Minecraft conversations with rich character descriptions and canonical function calls between utterances.
Findings
Generated hundreds of conversations from a small seed.
Includes detailed character descriptions and function calls.
Qualitative analysis confirms dataset quality and diversity.
Abstract
We propose a novel approach that uses large language models (LLMs) to generate persona-driven conversations between Players and Non-Player Characters (NPCs) in games. Showcasing the application of our methodology, we introduce the Minecraft Persona-driven Dialogue dataset (MCPDial). Starting with a small seed of expert-written conversations, we employ our method to generate hundreds of additional conversations. Each conversation in the dataset includes rich character descriptions of the player and NPC. The conversations are long, allowing for in-depth and extensive interactions between the player and NPC. MCPDial extends beyond basic conversations by incorporating canonical function calls (e.g., "Call find a resource on iron ore") between the utterances. Finally, we conduct a qualitative analysis of the dataset to assess its quality and characteristics.
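As a rough illustration of the structure the abstract describes (two persona descriptions, a long sequence of utterances, and canonical function calls placed between utterances), one conversation could be modeled as below. This is only a sketch under assumptions: the class names, field names, and sample dialogue text are hypothetical and are not taken from the dataset's actual schema. Only the function-call string comes from the abstract.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Turn:
    """One utterance, optionally followed by a canonical function call."""
    speaker: str                          # "player" or "npc" (assumed labels)
    utterance: str
    function_call: Optional[str] = None   # call issued after this utterance, if any


@dataclass
class Conversation:
    """Hypothetical record layout; the real MCPDial schema may differ."""
    player_persona: str
    npc_persona: str
    turns: List[Turn] = field(default_factory=list)

    def function_calls(self) -> List[str]:
        # Collect the function calls in the order they occur in the dialogue.
        return [t.function_call for t in self.turns if t.function_call is not None]


# Invented sample content, for illustration only.
example = Conversation(
    player_persona="A new player who wants to craft iron tools.",
    npc_persona="A seasoned miner who knows the nearby caves.",
    turns=[
        Turn("player", "Do you know where I can find iron?"),
        Turn("npc", "Let me check the area for you.",
             function_call="Call find a resource on iron ore"),
        Turn("npc", "There is a vein just east of the village."),
    ],
)

print(example.function_calls())
```

Keeping the function call attached to the turn that precedes it preserves its position relative to the surrounding utterances, which is the property the abstract emphasizes.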
Taxonomy
Topics: Persona Design and Applications · Topic Modeling
