Challenging the Validity of Personality Tests for Large Language Models

Tom Sühr; Florian E. Dorner; Samira Samadi; Augustin Kelava

arXiv:2311.05297 · cs.CL · June 6, 2024 · 5 cites

TL;DR

This paper provides evidence that personality tests designed for humans do not yield validly interpretable results for large language models: LLM responses deviate systematically from human responses and do not align with human personality structures.

Contribution

The study provides empirical evidence that existing human personality assessments are invalid for LLMs, highlighting the need for new evaluation methods.

Findings

01 LLMs often affirm both sides of reverse-coded items

02 Prompt variations do not produce clear personality factor separation

03 Responses to personality tests differ systematically from human responses

Abstract

With large language models (LLMs) like GPT-4 appearing to behave increasingly human-like in text-based interactions, it has become popular to attempt to evaluate personality traits of LLMs using questionnaires originally developed for humans. While reusing measures is a resource-efficient way to evaluate LLMs, careful adaptations are usually required to ensure that assessment results are valid even across human subpopulations. In this work, we provide evidence that LLMs' responses to personality tests systematically deviate from human responses, implying that the results of these tests cannot be interpreted in the same way. Concretely, reverse-coded items ("I am introverted" vs. "I am extraverted") are often both answered affirmatively. Furthermore, variation across prompts designed to "steer" LLMs to simulate particular personality types does not follow the clear separation into five…
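The reverse-coded item check mentioned in the abstract can be illustrated with a small sketch. This is not the authors' analysis code. It assumes a 1 to 5 Likert scale, treats a response of 4 or higher as affirmation, and uses made-up response pairs; the function names `reverse_score`, `both_affirmed`, and `inconsistency_rate` are hypothetical.

```python
from typing import List, Tuple

LIKERT_MAX = 5       # assumption: responses are integers on a 1..5 scale
AGREE_THRESHOLD = 4  # assumption: 4 or 5 counts as answering affirmatively


def reverse_score(response: int, scale_max: int = LIKERT_MAX) -> int:
    """Map a response on a reverse-coded item back onto the trait direction."""
    return scale_max + 1 - response


def both_affirmed(item: int, reversed_item: int,
                  threshold: int = AGREE_THRESHOLD) -> bool:
    """True if an item and its reverse-coded counterpart are both affirmed."""
    return item >= threshold and reversed_item >= threshold


def inconsistency_rate(pairs: List[Tuple[int, int]]) -> float:
    """Fraction of (item, reverse-coded item) pairs where both are affirmed."""
    if not pairs:
        raise ValueError("need at least one item pair")
    flagged = sum(both_affirmed(a, b) for a, b in pairs)
    return flagged / len(pairs)


# Made-up responses, e.g. ("I am extraverted", "I am introverted") pairs.
pairs = [(5, 4), (4, 2), (5, 5), (2, 4)]
rate = inconsistency_rate(pairs)
print(f"share of pairs affirmed on both sides: {rate:.2f}")
```

A respondent with a coherent trait level would be expected to give roughly `item == reverse_score(reversed_item)`; a high rate from `inconsistency_rate` signals the kind of agreement with both sides that the abstract describes.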

Taxonomy

Topics: Topic Modeling · Computational and Text Analysis Methods · Natural Language Processing Techniques

Methods: Attention Is All You Need · Softmax · Layer Normalization · Linear Layer · Byte Pair Encoding · Label Smoothing · Adam · Residual Connection · Position-Wise Feed-Forward Layer · Multi-Head Attention