Evaluating Large Language Models with Psychometrics

Yuan Li; Yue Huang; Hongyi Wang; Ying Cheng; Xiangliang Zhang; James Zou; Lichao Sun

arXiv:2406.17675 · cs.CL · October 20, 2025 · 6 cites

Open Access

TL;DR

This paper introduces a psychometric benchmark to evaluate large language models' psychological traits, revealing discrepancies between self-reports and actual responses, and highlighting challenges in adapting human-centric tests for AI.

Contribution

It develops a comprehensive psychometric assessment framework for LLMs, identifying five key psychological constructs and evaluating model behavior on them across diverse scenarios.

Findings

01. Discrepancies between LLMs' self-reports and response patterns.

02. Some human-designed tests are unreliable for LLMs.

03. Insights into LLMs' psychological trait assessment.

Abstract

Large Language Models (LLMs) have demonstrated exceptional capabilities in solving various tasks, progressively evolving into general-purpose assistants. The increasing integration of LLMs into society has sparked interest in whether they exhibit psychological patterns, and whether these patterns remain consistent across different contexts -- questions that could deepen the understanding of their behaviors. Inspired by psychometrics, this paper presents a comprehensive benchmark for quantifying psychological constructs of LLMs, encompassing psychological dimension identification, assessment dataset design, and assessment with results validation. Our work identifies five key psychological constructs -- personality, values, emotional intelligence, theory of mind, and self-efficacy -- assessed through a suite of 13 datasets featuring diverse scenarios and item types. We uncover…

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Artificial Intelligence in Healthcare and Education · Computational and Text Analysis Methods