CPsyExam: A Chinese Benchmark for Evaluating Psychology using Examinations

Jiahao Zhao; Jingwei Zhu; Minghuan Tan; Min Yang; Renhao Li; Di Yang; Chenhao Zhang; Guancheng Ye; Chengming Li; Xiping Hu; Derek F. Wong

arXiv:2405.10212·cs.CL·December 11, 2024

TL;DR

CPsyExam is a new Chinese psychological benchmark derived from exam questions, designed to evaluate and improve large language models' understanding of psychology and case analysis through diverse, real-world scenario questions.

Contribution

This paper introduces CPsyExam, a novel benchmark from Chinese exam questions, to assess and enhance LLMs' psychological knowledge and case analysis capabilities.

Findings

01. CPsyExam effectively evaluates LLMs' psychological understanding.

02. Existing LLMs show varied performance on CPsyExam.

03. The benchmark enables detailed comparison of LLMs across different aspects.

Abstract

In this paper, we introduce a novel psychological benchmark, CPsyExam, constructed from questions sourced from Chinese language examinations. CPsyExam is designed to prioritize psychological knowledge and case analysis separately, recognizing the significance of applying psychological knowledge to real-world scenarios. From a pool of 22k questions, we use 4k to create a benchmark that offers balanced coverage of subjects and incorporates a diverse range of case analysis techniques. Furthermore, we evaluate a range of existing large language models (LLMs), spanning from open-source to API-based models. Our experiments and analysis demonstrate that CPsyExam serves as an effective benchmark for enhancing the understanding of psychology within LLMs and enables the comparison of LLMs across various granularities.

Code & Models

Repositories

CAS-SIAT-XinHai/CPsyExam
Official

Taxonomy

Topics: Educational and Psychological Assessments