Automatic Extraction of Personality from Text: Challenges and Opportunities

Nazar Akrami, Johan Fernquist, Tim Isbister, Lisa Kaati, and Björn Pelzer

arXiv:1910.09916 · cs.CL · October 23, 2019

TL;DR

This paper investigates the challenges of extracting personality traits from text using machine learning, highlighting the importance of high-quality annotated data and the difficulties of model generalization in real-world scenarios.

Contribution

The study provides a comprehensive dataset with expert annotations and evaluates various models, revealing the limitations of current approaches in real-world personality prediction from text.

Findings

01

Models trained on high-reliability data outperform those trained on low-reliability data.

02

Language models perform better than baselines on high-quality datasets.

03

Models do not generalize well in real-world settings, performing no better than random chance.

Abstract

In this study, we examined the possibility of extracting personality traits from text. We created an extensive dataset by having experts annotate personality traits in a large number of texts from multiple online sources. From these annotated texts, we selected a sample and made further annotations, ending up with a large low-reliability dataset and a small high-reliability dataset. We then used the two datasets to train and test several machine learning models, including a language model, to extract personality from text. Finally, we evaluated our best models in the wild, on datasets from different domains. Our results show that the models based on the small high-reliability dataset performed better (in terms of $R^{2}$) than models based on the large low-reliability dataset. Also, the language model based on the small high-reliability dataset performed better than the random baseline. Finally,…
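The abstract compares models by $R^{2}$. As a reminder of what that metric measures, here is a minimal sketch in plain Python. It is not the authors' evaluation code: the function name `r_squared` and all trait scores below are hypothetical values chosen for illustration. It uses the standard definition $R^{2} = 1 - SS_{res}/SS_{tot}$, under which always predicting the mean of the true scores gives 0 and predictions worse than that give a negative value.

```python
from typing import Sequence


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true scores are identical")
    return 1.0 - ss_res / ss_tot


# Hypothetical annotated trait scores for five texts (illustration only).
true_scores = [1.0, 2.0, 3.0, 4.0, 5.0]

# Three hypothetical predictors: constant mean, a close fit, a reversed ranking.
mean_baseline = [3.0, 3.0, 3.0, 3.0, 3.0]
close_model = [1.5, 2.0, 2.5, 4.0, 5.0]
reversed_model = [5.0, 4.0, 3.0, 2.0, 1.0]

for name, preds in [
    ("mean baseline", mean_baseline),
    ("close model", close_model),
    ("reversed model", reversed_model),
]:
    print(f"{name}: R^2 = {r_squared(true_scores, preds):.2f}")
```

Because $R^{2}$ is measured relative to the mean predictor, a model that scores near or below zero on a new domain explains none of the variance in the annotated scores there.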
