Deep Lexical Hypothesis: Identifying personality structure in natural language
Andrew Cutler, David M. Condon

TL;DR
This paper presents a novel NLP-based method to extract personality structure from language models, aligning closely with traditional survey-based findings and enabling large-scale, multilingual personality analysis.
Contribution
It introduces a scalable, language-model-driven approach to identify personality traits from natural language, matching survey results and applicable across many languages and large datasets.
Findings
High congruence with survey-based personality structure: the first three unrotated factors match those in survey data with coefficients of 0.89, 0.79, and 0.79
Robustness across different adjective sets and language models
Weak recovery of Neuroticism and Openness traits
Abstract
Recent advances in natural language processing (NLP) have produced general models that can perform complex tasks such as summarizing long passages and translating across languages. Here, we introduce a method to extract adjective similarities from language models as done with survey-based ratings in traditional psycholexical studies but using millions of times more text in a natural setting. The correlational structure produced through this method is highly similar to that of self- and other-ratings of 435 terms reported by Saucier and Goldberg (1996a). The first three unrotated factors produced using NLP are congruent with those in survey data, with coefficients of 0.89, 0.79, and 0.79. This structure is robust to many modeling decisions: adjective set, including those with 1,710 terms (Goldberg, 1982) and 18,000 terms (Allport & Odbert, 1936); the query used to extract correlations;…
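The comparison described in the abstract, extracting unrotated factors from two correlation matrices and checking how congruent they are, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the data are synthetic, principal-component loadings stand in for whatever factor-extraction method the paper uses, Tucker's congruence coefficient is assumed as the congruence measure, and the names `unrotated_loadings`, `tucker_congruence`, and `simulate_corr` are hypothetical.

```python
import numpy as np


def unrotated_loadings(corr, k):
    """Return the first k unrotated principal-component loadings of a correlation matrix."""
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1][:k]  # largest eigenvalues first
    return eigvecs[:, order] * np.sqrt(np.clip(eigvals[order], 0.0, None))


def tucker_congruence(x, y):
    """Tucker's congruence coefficient between two loading vectors."""
    return float(x @ y / np.sqrt((x @ x) * (y @ y)))


def simulate_corr(true_loadings, seed, n_obs=2000):
    """Synthetic term-by-term correlation matrix sharing a latent structure (illustration only)."""
    rng = np.random.default_rng(seed)
    n_terms, n_factors = true_loadings.shape
    latent = rng.normal(size=(n_obs, n_factors))
    noise = rng.normal(size=(n_obs, n_terms))
    return np.corrcoef(latent @ true_loadings.T + noise, rowvar=False)


# Assumed toy setup: 40 "adjectives" driven by 3 latent factors of unequal strength.
rng = np.random.default_rng(0)
n_terms, n_factors = 40, 3
true_loadings = rng.normal(size=(n_terms, n_factors)) * np.array([3.0, 2.0, 1.0])

survey_corr = simulate_corr(true_loadings, seed=1)  # stand-in for survey ratings
nlp_corr = simulate_corr(true_loadings, seed=2)     # stand-in for model-derived similarities

survey_load = unrotated_loadings(survey_corr, n_factors)
nlp_load = unrotated_loadings(nlp_corr, n_factors)

# The sign of each factor is arbitrary, so compare absolute congruence factor by factor.
congruence = [
    abs(tucker_congruence(survey_load[:, i], nlp_load[:, i]))
    for i in range(n_factors)
]
print(congruence)
```

With real data, `survey_corr` would come from self- and other-ratings and `nlp_corr` from the language-model similarities; both must cover the same terms in the same order for the per-factor comparison to be meaningful.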
Taxonomy
Topics: Personality Traits and Psychology · Mental Health Research Topics · Cognitive Abilities and Testing
