AI Psychometrics: Evaluating the Psychological Reasoning of Large Language Models with Psychometric Validities
Yibai Li, Xiaolin Lin, Zhenghui Sha, Zhiye Jin, Xiaobing Li

TL;DR
This paper applies psychometric methods to evaluate the psychological reasoning and validity of large language models, revealing that higher-performing models show better psychometric validity, thus supporting AI Psychometrics as a useful evaluation approach.
Contribution
It introduces AI Psychometrics as a novel framework for evaluating LLMs' psychological reasoning using psychometric validity criteria.
Findings
Responses from all four models generally met the validity criteria
GPT-4 and LLaMA-3 showed superior validity
Higher model performance correlates with better psychometric validity
Abstract
The immense number of parameters and deep neural networks make large language models (LLMs) rival the complexity of human brains, which also makes them opaque "black box" systems that are challenging to evaluate and interpret. AI Psychometrics is an emerging field that aims to tackle these challenges by applying psychometric methodologies to evaluate and interpret the psychological traits and processes of artificial intelligence (AI) systems. This paper investigates the application of AI Psychometrics to evaluate the psychological reasoning and overall psychometric validity of four prominent LLMs: GPT-3.5, GPT-4, LLaMA-2, and LLaMA-3. Using the Technology Acceptance Model (TAM), we examined convergent, discriminant, predictive, and external validity across these models. Our findings reveal that the responses from all these models generally met all validity criteria. Moreover,…
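The abstract does not say which statistics were used for each validity check. As a rough illustration only, convergent and discriminant validity in survey research are often assessed with average variance extracted (AVE), composite reliability (CR), and the Fornell-Larcker criterion. The sketch below assumes standardized factor loadings are already available. The construct names (perceived usefulness and perceived ease of use, the two core TAM constructs), the loading values, the correlation, and the 0.5 and 0.7 cutoffs are illustrative assumptions, not figures from the paper.

```python
import math

def ave(loadings):
    """Average variance extracted: mean of squared standardized loadings."""
    return sum(l * l for l in loadings) / len(loadings)

def composite_reliability(loadings):
    """Composite reliability from standardized loadings."""
    total = sum(loadings) ** 2
    error = sum(1 - l * l for l in loadings)  # item error variances
    return total / (total + error)

def convergent_ok(loadings, ave_min=0.5, cr_min=0.7):
    """Conventional cutoffs (assumed here): AVE >= 0.5 and CR >= 0.7."""
    return ave(loadings) >= ave_min and composite_reliability(loadings) >= cr_min

def discriminant_ok(loadings_a, loadings_b, corr_ab):
    """Fornell-Larcker: sqrt(AVE) of each construct exceeds their correlation."""
    r = abs(corr_ab)
    return math.sqrt(ave(loadings_a)) > r and math.sqrt(ave(loadings_b)) > r

# Hypothetical loadings for two TAM constructs (not data from the paper).
pu_loadings = [0.82, 0.79, 0.85]    # perceived usefulness items
peou_loadings = [0.76, 0.81, 0.74]  # perceived ease of use items
pu_peou_corr = 0.55                 # hypothetical inter-construct correlation

pu_convergent = convergent_ok(pu_loadings)
peou_convergent = convergent_ok(peou_loadings)
pu_peou_discriminant = discriminant_ok(pu_loadings, peou_loadings, pu_peou_corr)
```

Applied to LLM output, the same checks would be run on loadings estimated from each model's questionnaire responses, so that models can be compared on identical criteria.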
Taxonomy
Topics: Artificial Intelligence in Healthcare and Education · Explainable Artificial Intelligence (XAI) · Digital Mental Health Interventions
