Large Language Model Psychometrics: A Systematic Review of Evaluation, Validation, and Enhancement

Haoran Ye; Jing Jin; Yuhang Xie; Xin Zhang; Guojie Song

arXiv:2505.08245·cs.CL·March 12, 2026

TL;DR

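This paper reviews the emerging field of LLM psychometrics, which applies psychometric instruments, theories, and principles to evaluate, understand, and enhance large language models. It responds to challenges that traditional evaluation does not cover: measuring human-like psychological constructs, moving beyond static and task-specific benchmarks, and establishing human-centered evaluation.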

Contribution

It introduces a structured framework for LLM psychometrics, synthesizing interdisciplinary methods and providing actionable insights for future evaluation paradigms.

Findings

01 The reviewed literature systematically shapes benchmarking principles.

02 It broadens evaluation scope beyond static, task-specific benchmarks.

03 It refines evaluation methodologies and validates results.

Abstract

The advancement of large language models (LLMs) has outpaced traditional evaluation methodologies. This progress presents novel challenges, such as measuring human-like psychological constructs, moving beyond static and task-specific benchmarks, and establishing human-centered evaluation. These challenges intersect with psychometrics, the science of quantifying the intangible aspects of human psychology, such as personality, values, and intelligence. This review paper introduces and synthesizes the emerging interdisciplinary field of LLM Psychometrics, which leverages psychometric instruments, theories, and principles to evaluate, understand, and enhance LLMs. The reviewed literature systematically shapes benchmarking principles, broadens evaluation scopes, refines methodologies, validates results, and advances LLM capabilities. Diverse perspectives are integrated to provide a…

Code & Models

Repositories

valuebyte-ai/awesome-llm-psychometrics (Official)

Taxonomy

Methods: ALIGN