Big5PersonalityEssays: Introducing a Novel Synthetic Generated Dataset Consisting of Short State-of-Consciousness Essays Annotated Based on the Five Factor Model of Personality
Iustin Floroiu

TL;DR
This paper introduces a synthetic dataset of short essays annotated with Five Factor Model personality traits, intended to support AI and psychological research in a field where few datasets exist.
Contribution
It presents a novel synthetic dataset of personality-annotated essays to support AI and psychological studies in a data-scarce field.
Findings
Created a large, labeled synthetic essay dataset
Demonstrated potential for AI-based personality analysis
Facilitates research in psychology and NLP
Abstract
Given the rapid advances in large language models (LLMs), it is vitally important to study their behavior and apply them across scientific fields. In recent years, psychology has seldom been approached with novel computational tools. One reason is the high complexity of the data required for a proper analysis. Moreover, psychology, and psychometrics in particular, has few datasets available for analysis and for use in artificial intelligence. For these reasons, this study introduces a synthetic database of short essays labeled according to the five factor model (FFM) of personality traits.
Taxonomy
Topics: Mental Health Research Topics
Methods: Focus
