"Pull or Not to Pull?": Investigating Moral Biases in Leading Large Language Models Across Ethical Dilemmas

Junchen Ding; Penghao Jiang; Zihao Xu; Ziqi Ding; Yichen Zhu; Jiaojiao Jiang; Yuekang Li

arXiv:2508.07284·cs.CL·August 12, 2025·2 cites

TL;DR

This paper empirically evaluates 14 large language models across diverse ethical dilemmas, revealing variability in moral reasoning, decisiveness, and alignment with human judgments, highlighting the importance of moral prompting as a diagnostic tool.

Contribution

The paper introduces a comprehensive factorial prompting protocol for analyzing LLMs' moral reasoning across multiple ethical frameworks, and it identifies key patterns and zones of alignment and divergence.

Findings

01. Reasoning-enabled models show greater decisiveness and more structured justifications.

02. Models achieve high alignment in altruistic, fairness, and virtue ethics frames.

03. Divergence is observed in kinship, legality, and self-interest frames.

Abstract

As large language models (LLMs) increasingly mediate ethically sensitive decisions, understanding their moral reasoning processes becomes imperative. This study presents a comprehensive empirical evaluation of 14 leading LLMs, both reasoning-enabled and general-purpose, across 27 diverse trolley problem scenarios, framed by ten moral philosophies, including utilitarianism, deontology, and altruism. Using a factorial prompting protocol, we elicited 3,780 binary decisions and natural language justifications, enabling analysis along axes of decisional assertiveness, explanation-answer consistency, public moral alignment, and sensitivity to ethically irrelevant cues. Our findings reveal significant variability across ethical frames and model types: reasoning-enhanced models demonstrate greater decisiveness and structured justifications, yet do not always align better with human consensus.…
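The decision count in the abstract matches a full factorial crossing of its three stated factors: 14 models × 27 scenarios × 10 moral frames = 3,780. The following minimal sketch enumerates such a design. It assumes one query per (model, scenario, frame) cell, which is the reading that reproduces the reported total. The identifiers `model_01`, `scenario_01`, `frame_01`, and the helper `build_design` are hypothetical placeholders, since this page does not list the actual model, scenario, or frame names.

```python
from itertools import product

# Factor sizes as stated in the abstract.
N_MODELS, N_SCENARIOS, N_FRAMES = 14, 27, 10

# Hypothetical placeholder identifiers; the real names are not given on this page.
models = [f"model_{i:02d}" for i in range(1, N_MODELS + 1)]
scenarios = [f"scenario_{i:02d}" for i in range(1, N_SCENARIOS + 1)]
frames = [f"frame_{i:02d}" for i in range(1, N_FRAMES + 1)]


def build_design(models, scenarios, frames):
    """Return every (model, scenario, frame) cell of a full factorial design."""
    return list(product(models, scenarios, frames))


# Assumption: exactly one prompt per cell, so cells == elicited decisions.
design = build_design(models, scenarios, frames)
print(len(design))  # 14 * 27 * 10 = 3780
```

Each tuple in `design` would correspond to one prompt, yielding one binary decision plus its justification. If the study repeated prompts per cell, the per-cell count would have to be a factor the abstract does not mention.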

Taxonomy

Topics: Artificial Intelligence in Healthcare and Education · Explainable Artificial Intelligence (XAI) · Topic Modeling