ClarityEthic: Explainable Moral Judgment Utilizing Contrastive Ethical Insights from Large Language Models

Yuxi Sun; Wei Gao; Jing Ma; Hongzhan Lin; Ziyang Luo; Wenxuan Zhang

arXiv:2412.12848·cs.CY·April 10, 2025

Open Access

TL;DR

ClarityEthic is a novel approach that uses contrastive learning with large language models to identify relevant social norms and improve the accuracy and explainability of moral judgments.

Contribution

It introduces a new method leveraging LLM reasoning and contrastive learning to select appropriate norms for moral decision-making, enhancing interpretability and performance.
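To make the idea of contrastive norm selection concrete, here is a minimal sketch, assuming an InfoNCE-style objective over toy embedding vectors. It is not the authors' implementation: the function names (`cosine`, `info_nce`, `select_norm`), the temperature value, the three-dimensional vectors, and the "help others" norm are all hypothetical and chosen only for illustration. Only the "don't cheat" norm comes from the paper's abstract.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style contrastive loss (an assumed objective, not the paper's).

    The loss is small when the anchor is closer to the positive than to
    the negatives, and large otherwise.
    """
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))

def select_norm(scenario_vec, norm_vecs):
    """Return the name of the norm whose embedding is most similar to the scenario."""
    return max(norm_vecs, key=lambda name: cosine(scenario_vec, norm_vecs[name]))

# Hypothetical toy embeddings; a real system would obtain these from a trained encoder.
scenario = [0.9, 0.1, 0.0]
norms = {
    "don't cheat": [1.0, 0.0, 0.0],
    "help others": [0.0, 1.0, 0.0],
}

chosen = select_norm(scenario, norms)
loss_matching = info_nce(scenario, norms["don't cheat"], [norms["help others"]])
loss_mismatch = info_nce(scenario, norms["help others"], [norms["don't cheat"]])
print(chosen, loss_matching < loss_mismatch)
```

In this sketch, training would push a scenario's embedding toward the norm that fits it and away from competing norms, so that a nearest-norm lookup at inference time can surface the norm used to explain the judgment.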

Findings

01 Outperforms state-of-the-art baselines in moral judgment accuracy

02 Provides plausible social norm explanations for its judgments

03 Human evaluations confirm the quality of the explanations

Abstract

With the rise and widespread use of Large Language Models (LLMs), ensuring their safety is crucial to prevent harm to humans and promote ethical behavior. However, directly assessing value valence (i.e., support or oppose) by relying on large-scale data training is untrustworthy and unexplainable. We assume that emulating the human reliance on social norms when making moral decisions can help LLMs understand and predict moral judgments. However, capturing human values remains a challenge, as multiple related norms might conflict in specific contexts. Norms that are upheld by the majority and promote the well-being of society are more likely to be accepted and widely adopted (e.g., "don't cheat"). Therefore, it is essential for LLMs to identify the appropriate norms for a given scenario before making moral decisions. To this end, we introduce a novel moral judgment approach called…

Taxonomy

Topics: Explainable Artificial Intelligence (XAI)

Methods: Contrastive Learning