The Moral Foundations Weibo Corpus
Renjie Cao, Miaoyan Hu, Jiahan Wei, Baha Ihnaini

TL;DR
This paper introduces a large, annotated Chinese Weibo comment corpus focused on moral sentiments, enabling better NLP analysis of morality in social media, with baseline model evaluations included.
Contribution
It presents the first extensive Chinese moral sentiment corpus with manual annotations and reliability assessments, plus baseline evaluations of large language models for moral classification.
Findings
High inter-annotator agreement demonstrated reliability
Large language models show promising performance in moral sentiment classification
The corpus enables nuanced analysis of Chinese social media morality expressions
Abstract
Moral sentiments expressed in natural language significantly influence both online and offline environments, shaping behavioral styles and interaction patterns, including social media self-presentation, cyberbullying, adherence to social norms, and ethical decision-making. To effectively measure moral sentiments in natural language processing texts, it is crucial to utilize large, annotated datasets that provide nuanced understanding for accurate analysis and model training. However, existing corpora, while valuable, often face linguistic limitations. To address this gap in the Chinese language domain, we introduce the Moral Foundation Weibo Corpus. This corpus consists of 25,671 Chinese comments on Weibo, encompassing six diverse topic areas. Each comment is manually annotated by at least three systematically trained annotators based on ten moral categories derived from a grounded theory…
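The annotation setup described in the abstract (several annotators per comment, a fixed set of moral categories) can be illustrated with a minimal sketch of label aggregation and agreement measurement. The excerpt does not say which agreement statistic or aggregation rule the authors used; Fleiss' kappa and a strict majority vote are assumed here as common choices. The function names and the labels "care", "harm", and "fairness" are hypothetical placeholders, not the corpus's confirmed label set, and the kappa function assumes every comment has the same number of annotators.

```python
from collections import Counter


def majority_label(labels):
    """Return the label chosen by a strict majority of annotators, else None.

    Assumption: a strict majority rule; the corpus may aggregate differently.
    """
    label, count = Counter(labels).most_common(1)[0]
    return label if count > len(labels) / 2 else None


def fleiss_kappa(ratings, categories):
    """Compute Fleiss' kappa for a list of per-item label lists.

    Assumption: every item is rated by the same number of annotators
    (at least two), which the standard formula requires.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(r) != n_raters for r in ratings):
        raise ValueError("each item needs the same number (>= 2) of ratings")

    counts = [Counter(r) for r in ratings]

    # Observed agreement: share of agreeing annotator pairs, averaged over items.
    per_item = [
        (sum(c[k] ** 2 for k in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_observed = sum(per_item) / n_items

    # Expected agreement from the overall category proportions.
    total = n_items * n_raters
    p_expected = sum((sum(c[k] for c in counts) / total) ** 2 for k in categories)

    if p_expected == 1.0:
        return 1.0  # only one category was ever used; agreement is trivial
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical toy data: three annotators per comment.
toy_ratings = [
    ["care", "care", "care"],
    ["harm", "harm", "harm"],
    ["care", "harm", "fairness"],
]
toy_categories = ["care", "harm", "fairness"]

gold = [majority_label(r) for r in toy_ratings]
kappa = fleiss_kappa(toy_ratings, toy_categories)
```

A comment with no strict majority gets `None` here, which a real pipeline might route to adjudication or drop; that handling is also an assumption.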
Taxonomy
Topics: Hate Speech and Cyberbullying Detection
