CMoralEval: A Moral Evaluation Benchmark for Chinese Large Language Models
Linhao Yu, Yongqi Leng, Yufei Huang, Shang Wu, Haixin Liu, Xinmeng Ji, Jiahui Zhao, Jinwang Song, Tingting Cui, Xiaoqing Cheng, Tao Liu, Deyi Xiong

TL;DR
CMoralEval is a comprehensive Chinese moral evaluation benchmark for large language models, combining diverse authentic scenarios and dilemmas rooted in Chinese culture to assess ethical responses.
Contribution
The paper introduces CMoralEval, a novel Chinese moral evaluation benchmark with AI-assisted annotation, diverse data sources, and a morality taxonomy aligned with cultural norms.
Findings
CMoralEval is challenging for current Chinese LLMs.
The dataset includes over 30,000 instances of moral scenarios.
The benchmark covers both explicit moral scenarios and moral dilemma scenarios.
Abstract
What would a large language model (LLM) respond in an ethically relevant context? In this paper, we curate a large benchmark, CMoralEval, for morality evaluation of Chinese LLMs. The data sources of CMoralEval are two-fold: 1) a Chinese TV program discussing Chinese moral norms with stories from society and 2) a collection of Chinese moral anomies from various newspapers and academic papers on morality. With these sources, we aim to create a moral evaluation dataset characterized by diversity and authenticity. We develop a morality taxonomy and a set of fundamental moral principles that are not only rooted in traditional Chinese culture but also consistent with contemporary societal norms. To facilitate efficient construction and annotation of instances in CMoralEval, we establish a platform with AI-assisted instance generation to streamline the annotation process. These help us curate…
Taxonomy
Topics: Computational and Text Analysis Methods · Topic Modeling
Methods: Sparse Evolutionary Training
