Do Language Models Understand Morality? Towards a Robust Detection of Moral Content

Luana Bulla; Aldo Gangemi; Misael Mongiovì

arXiv:2406.04143·cs.CL·June 7, 2024

TL;DR

This paper explores the use of large language models, especially GPT 3.5, as zero-shot classifiers for detecting moral values in text, comparing their performance with supervised models across different domains.

Contribution

It proposes the Davinci (GPT 3.5) model as a state-of-the-art zero-shot moral value detector and compares it with supervised approaches to assess cross-domain robustness.

Findings

01 GPT 3.5 achieves competitive results in moral detection without training data

02 Unsupervised NLI-based models perform comparably to larger models

03 Supervised models struggle with cross-domain generalization

Abstract

The task of detecting moral values in text has significant implications in various fields, including natural language processing, social sciences, and ethical decision-making. Previously proposed supervised models often suffer from overfitting, leading to hyper-specialized moral classifiers that struggle to perform well on data from different domains. To address this issue, we introduce novel systems that leverage abstract concepts and common-sense knowledge acquired from Large Language Models and Natural Language Inference models during previous stages of training on multiple data sources. By doing so, we aim to develop versatile and robust methods for detecting moral values in real-world scenarios. Our approach uses the GPT 3.5 model as a zero-shot ready-made unsupervised multi-label classifier for moral values detection, eliminating the need for explicit training on labeled data. We…
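As a rough illustration of what using a language model as a "zero-shot ready-made unsupervised multi-label classifier" can look like, the sketch below builds a prompt, sends it to a completion function, and maps the free-text reply to a 0/1 vector over labels. It is not the authors' implementation: the label set `MORAL_LABELS`, the prompt wording, and the helper names (`build_prompt`, `parse_labels`, `classify`, `fake_complete`) are all assumptions made for illustration, and the model call is replaced by a stub so the example runs offline.

```python
from typing import Callable, Dict, List

# Hypothetical label set for illustration; the abstract does not list the labels.
MORAL_LABELS: List[str] = ["care", "fairness", "loyalty", "authority", "purity"]


def build_prompt(text: str, labels: List[str]) -> str:
    """Compose a zero-shot instruction asking for every applicable label."""
    options = ", ".join(labels)
    return (
        "Which of the following moral values are expressed in the text? "
        f"Options: {options}. "
        "Answer with a comma-separated list, or 'none'.\n"
        f"Text: {text}\nAnswer:"
    )


def parse_labels(reply: str, labels: List[str]) -> Dict[str, int]:
    """Map a free-text reply to a multi-label 0/1 vector."""
    mentioned = {part.strip().lower().rstrip(".") for part in reply.split(",")}
    return {label: int(label in mentioned) for label in labels}


def classify(
    text: str,
    complete: Callable[[str], str],
    labels: List[str] = MORAL_LABELS,
) -> Dict[str, int]:
    """Zero-shot multi-label classification: no training, only a prompt."""
    return parse_labels(complete(build_prompt(text, labels)), labels)


def fake_complete(prompt: str) -> str:
    """Stub standing in for a real language-model call (assumption)."""
    return "Care, Fairness."


if __name__ == "__main__":
    print(classify("We must protect the vulnerable and treat everyone equally.",
                   fake_complete))
```

In practice `fake_complete` would be swapped for a call to whichever model is being evaluated; keeping the model behind a plain `str -> str` function makes the prompt and parsing logic testable without network access.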

Code & Models

Repositories

LuanaBulla/Detection-of-Morality-in-Text
PyTorch · Official

Taxonomy

Topics: Psychology of Moral and Emotional Judgment · Hate Speech and Cyberbullying Detection

Methods: Attention Is All You Need · Linear Layer · Byte Pair Encoding · Adam · Residual Connection · Multi-Head Attention · Dropout · Dense Connections · Cosine Annealing