REALTALK: A 21-Day Real-World Dataset for Long-Term Conversation
Dong-Ho Lee, Adyasha Maharana, Jay Pujara, Xiang Ren, Francesco Barbieri

TL;DR
REALTALK is a 21-day real-world messaging dataset that enables research on long-term dialogue, emotional intelligence, and persona consistency, highlighting challenges in memory and persona modeling in chatbots.
Contribution
Introduces the first authentic 21-day messaging dataset, REALTALK, and proposes benchmark tasks for persona simulation and memory probing in real-world conversations.
Findings
Models struggle with persona simulation from dialogue history.
Fine-tuning improves persona emulation.
Models face challenges in long-term memory recall.
Abstract
Long-term, open-domain dialogue capabilities are essential for chatbots aiming to recall past interactions and demonstrate emotional intelligence (EI). Yet, most existing research relies on synthetic, LLM-generated data, leaving open questions about real-world conversational patterns. To address this gap, we introduce REALTALK, a 21-day corpus of authentic messaging app dialogues, providing a direct benchmark against genuine human interactions. We first conduct a dataset analysis, focusing on EI attributes and persona consistency to understand the unique challenges posed by real-world dialogues. By comparing with LLM-generated conversations, we highlight key differences, including diverse emotional expressions and variations in persona stability that synthetic dialogues often fail to capture. Building on these insights, we introduce two benchmark tasks: (1) persona simulation where…
Taxonomy
Topics: Speech and dialogue systems
