Persona-Based Synthetic Data Generation Using Multi-Stage Conditioning with Large Language Models for Emotion Recognition
Keito Inoshita, Rushia Harada

TL;DR
This paper presents PersonaGen, a multi-stage, persona-based framework that uses a large language model to generate diverse, realistic emotional text. The generated data is intended to ease the scarcity of emotional datasets and to improve emotion recognition models.
Contribution
Introduces PersonaGen, a novel multi-stage conditioning approach that uses a large language model to synthesize emotionally rich, diverse text from layered virtual personas.
Findings
PersonaGen outperforms baselines in diversity and realism.
Synthetic data improves emotion classification accuracy.
Framework effectively captures subjective emotional expressions.
Abstract
In the field of emotion recognition, the development of high-performance models remains a challenge due to the scarcity of high-quality, diverse emotional datasets. Emotional expressions are inherently subjective, shaped by individual personality traits, socio-cultural backgrounds, and contextual factors, making large-scale, generalizable data collection both ethically and practically difficult. To address this issue, we introduce PersonaGen, a novel framework for generating emotionally rich text using a Large Language Model (LLM) through multi-stage persona-based conditioning. PersonaGen constructs layered virtual personas by combining demographic attributes, socio-cultural backgrounds, and detailed situational contexts, which are then used to guide emotion expression generation. We conduct comprehensive evaluations of the generated synthetic data, assessing semantic diversity through…
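The layered persona construction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Persona` class, the `build_personas` and `build_prompt` functions, the stage wording, and the example attribute values are all hypothetical, and the abstract does not say whether PersonaGen uses one combined prompt or several model calls. The sketch only shows how the three layers named in the abstract (demographic attributes, socio-cultural background, situational context) might be combined and then turned into a staged prompt for an LLM.

```python
from dataclasses import dataclass
from itertools import product


# Hypothetical container for the three persona layers named in the abstract.
@dataclass(frozen=True)
class Persona:
    demographics: str   # e.g. age and occupation
    background: str     # socio-cultural background
    situation: str      # detailed situational context


def build_personas(demographics, backgrounds, situations):
    """Combine every value of each layer into one layered persona.

    Assumption: a full Cartesian product is used here for simplicity;
    the paper may sample or filter combinations differently.
    """
    return [Persona(d, b, s) for d, b, s in product(demographics, backgrounds, situations)]


def build_prompt(persona, emotion):
    """Assemble a staged conditioning prompt, one line per layer.

    The stage labels and task wording are illustrative assumptions.
    """
    stages = [
        f"Stage 1 (demographics): You are {persona.demographics}.",
        f"Stage 2 (socio-cultural background): {persona.background}",
        f"Stage 3 (situation): {persona.situation}",
        f"Task: Write one short first-person message expressing {emotion}.",
    ]
    return "\n".join(stages)


if __name__ == "__main__":
    personas = build_personas(
        ["a 34-year-old nurse"],
        ["Grew up in a rural town."],
        ["Missed the last train home.", "Received a promotion."],
    )
    # Each prompt would then be sent to an LLM of the reader's choice.
    for p in personas:
        print(build_prompt(p, "frustration"))
        print("---")
```

Keeping each layer as a separate field makes it easy to vary one layer while holding the others fixed, which is one way to probe how much each layer contributes to the diversity of the generated text.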
Taxonomy
Topics: Persona Design and Applications · Sentiment Analysis and Opinion Mining · Mental Health via Writing
