Personality Detection and Analysis using Twitter Data
Abhilash Datta, Souvic Chakraborty, Animesh Mukherjee

TL;DR
This paper introduces a large-scale Twitter dataset with 152 million tweets for Myers-Briggs personality prediction, providing extensive analysis and baseline evaluations to advance computational personality detection.
Contribution
The paper presents what the authors describe as the largest automatically curated Twitter dataset for personality analysis (152 million tweets, 56 thousand data points), along with qualitative and quantitative studies and baseline performance evaluations.
Findings
Data patterns align with natural intuition
Baseline models provide initial performance benchmarks
Extensive dataset enables better understanding of personality signals
Abstract
Personality types are important in various fields as they hold relevant information about the characteristics of a human being in an explainable format. They are often good predictors of a person's behaviors in a particular environment and have applications ranging from candidate selection to marketing and mental health. Recently, automatic detection of personality traits from text has gained significant attention in computational linguistics. Most personality detection and analysis methods have focused on small datasets, which often limits their experimental observations. To bridge this gap, we focus on collecting and releasing the largest automatically curated dataset for the research community, which has 152 million tweets and 56 thousand data points for the Myers-Briggs personality type (MBTI) prediction task. We perform a series of extensive qualitative and quantitative studies on…
Taxonomy
Topics: Personality Traits and Psychology · Mental Health via Writing · Gambling Behavior and Treatments
Methods: Focus
