CMoralEval: A Moral Evaluation Benchmark for Chinese Large Language Models

Linhao Yu; Yongqi Leng; Yufei Huang; Shang Wu; Haixin Liu; Xinmeng Ji; Jiahui Zhao; Jinwang Song; Tingting Cui; Xiaoqing Cheng; Tao Liu; Deyi Xiong

arXiv:2408.09819 · cs.CL · August 20, 2024

TL;DR

CMoralEval is a comprehensive Chinese moral evaluation benchmark for large language models, combining diverse authentic scenarios and dilemmas rooted in Chinese culture to assess ethical responses.

Contribution

The paper introduces CMoralEval, a novel Chinese moral evaluation benchmark with AI-assisted annotation, diverse data sources, and a morality taxonomy aligned with cultural norms.

Findings

01 CMoralEval is challenging for current Chinese LLMs.

02 The dataset includes over 30,000 instances of moral scenarios.

03 The benchmark covers both explicit moral and dilemma scenarios.

Abstract

How would a large language model (LLM) respond in an ethically relevant context? In this paper, we curate a large benchmark, CMoralEval, for morality evaluation of Chinese LLMs. The data sources of CMoralEval are two-fold: 1) a Chinese TV program that discusses Chinese moral norms through stories from society and 2) a collection of Chinese moral anomies drawn from various newspapers and academic papers on morality. With these sources, we aim to create a moral evaluation dataset characterized by diversity and authenticity. We develop a morality taxonomy and a set of fundamental moral principles that are not only rooted in traditional Chinese culture but also consistent with contemporary societal norms. To facilitate efficient construction and annotation of instances in CMoralEval, we establish a platform with AI-assisted instance generation to streamline the annotation process. These help us curate…

Code & Models

Repositories

tjunlp-lab/cmoraleval
none · Official

Taxonomy

Topics: Computational and Text Analysis Methods · Topic Modeling

Methods: Sparse Evolutionary Training