Analyzing the Ethical Logic of Six Large Language Models

W. Russell Neuman, Chad Coleman, and Manan Shah

arXiv:2501.08951 · cs.AI · January 16, 2025 · 5 citations

Open Access

TL;DR

This study investigates the ethical reasoning of six large language models. It finds a predominantly consequentialist approach, with differences among models attributed to fine-tuning and training, and moral reasoning that the authors compare to graduate-level discourse.

Contribution

The paper introduces an explainability framework for analyzing LLMs' ethical reasoning, which reveals rationalist, consequentialist tendencies and variations attributable to fine-tuning.

Findings

01. Models exhibit convergent ethical logic emphasizing harm minimization.

02. Significant differences in ethical reasoning reflect fine-tuning and training variations.

03. Models demonstrate high-level moral reasoning comparable to graduate-level discourse.

Abstract

This study examines the ethical reasoning of six prominent generative large language models: OpenAI GPT-4o, Meta LLaMA 3.1, Perplexity, Anthropic Claude 3.5 Sonnet, Google Gemini, and Mistral 7B. The research explores how these models articulate and apply ethical logic, particularly in response to moral dilemmas such as the Trolley Problem and the Heinz Dilemma. Departing from traditional alignment studies, the study adopts an explainability-transparency framework, prompting models to explain their ethical reasoning. The responses are analyzed through three established ethical typologies: the consequentialist-deontological analytic, Moral Foundations Theory, and the Kohlberg Stages of Moral Development Model. Findings reveal that LLMs exhibit largely convergent ethical logic, marked by a rationalist, consequentialist emphasis, with decisions often prioritizing harm minimization and…

Taxonomy

Topics: Hate Speech and Cyberbullying Detection

Methods: LLaMA