Beyond Discrete Personas: Personality Modeling Through Journal Intensive Conversations
Sayantan Pal, Souvik Das, Rohini K. Srihari

TL;DR
This paper introduces a large, novel dataset of personalized dialogues based on journal entries, capturing evolving human personalities and improving LLMs' ability to generate authentic, personality-rich conversations.
Contribution
The authors build a new dataset from Reddit journal entries, cluster each author's entries and keep the most representative cluster, refine the data using the Big Five personality traits, and fine-tune LLMs to produce more authentic, personality-aligned dialogues.
Findings
11% improvement in personality trait accuracy
Enhanced coherence and personality richness in generated dialogues
Effective clustering and filtering of journal data
Abstract
Large Language Models (LLMs) have significantly improved personalized conversational capabilities. However, existing datasets like Persona Chat, Synthetic Persona Chat, and Blended Skill Talk rely on static, predefined personas. This approach often results in dialogues that fail to capture the fluid and evolving nature of human personalities. To overcome these limitations, we introduce a novel dataset with around 400,000 dialogues and a framework for generating personalized conversations using long-form journal entries from Reddit. Our approach clusters journal entries for each author and filters them by selecting the most representative cluster, ensuring that the retained entries best reflect the author's personality. We further refine the data by capturing the Big Five personality traits (openness, conscientiousness, extraversion, agreeableness, and neuroticism), ensuring that dialogues…
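The per-author clustering and filtering step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which clustering algorithm or embedding model is used, nor how "most representative" is defined. The sketch assumes plain k-means, toy 2-D vectors standing in for real text embeddings, and "most representative" meaning the largest cluster. The names `kmeans` and `representative_entries` are hypothetical.

```python
from collections import Counter
from math import dist


def kmeans(points, k, iters=10):
    """Plain k-means with deterministic init (the first k points are the centroids)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(xs) / len(members) for xs in zip(*members)]
    return labels


def representative_entries(entries, vectors, k=2):
    """Keep only the entries in the largest cluster (assumed proxy for 'most representative')."""
    labels = kmeans(vectors, k)
    top_label, _ = Counter(labels).most_common(1)[0]
    return [e for e, lab in zip(entries, labels) if lab == top_label]


# Toy journal entries for one author, with made-up 2-D "embeddings".
entries = ["hiking trip", "tax paperwork", "trail run", "camping weekend"]
vectors = [(0.9, 0.1), (0.1, 0.9), (0.8, 0.2), (0.85, 0.2)]

kept = representative_entries(entries, vectors)
print(kept)
```

In this toy run the three outdoor-themed entries fall into one cluster and the off-topic entry is dropped. A real pipeline would replace the hand-written vectors with sentence embeddings and would likely need a more careful choice of `k` and of the representativeness criterion.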
Taxonomy
Topics: Persona Design and Applications · Language, Metaphor, and Cognition · Innovative Human-Technology Interaction
Methods: LLaMA
