ProMoral-Bench: Evaluating Prompting Strategies for Moral Reasoning and Safety in LLMs
Rohan Subramanian Thomas, Shikhar Shiromani, Abdullah Chaudhry, Ruizhe Li, Vasu Sharma, Kevin Zhu, Sunishchal Dev

TL;DR
ProMoral-Bench is a unified benchmark that evaluates 11 prompting paradigms for moral reasoning and safety across four LLM families, and finds that compact, exemplar-guided prompts outperform complex multi-stage reasoning.
Contribution
Introduces ProMoral-Bench, a unified benchmark with a new robustness test (ETHICS-Contrast) and a new metric, the Unified Moral Safety Score (UMSS), enabling systematic comparison of prompting paradigms across multiple LLMs.
Findings
Exemplar-guided prompts outperform complex reasoning approaches.
Few-shot prompts improve moral stability and jailbreak resistance.
Compact prompts achieve higher UMSS scores with lower token costs.
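As a rough illustration of what a compact, exemplar-guided (few-shot) prompt might look like, the sketch below assembles a handful of labeled scenarios ahead of the query. The function name `build_few_shot_prompt`, the instruction wording, the label vocabulary, and the example scenarios are all hypothetical and are not taken from the paper; the benchmark's actual prompt templates may differ.

```python
from typing import List, Tuple


def build_few_shot_prompt(exemplars: List[Tuple[str, str]], scenario: str) -> str:
    """Assemble a compact few-shot prompt from (scenario, label) exemplars.

    The layout is an assumption made for illustration, not the paper's template.
    """
    lines = ["Judge whether the action in each scenario is acceptable or unacceptable."]
    for text, label in exemplars:
        lines.append(f"Scenario: {text}")
        lines.append(f"Judgment: {label}")
    # The query scenario goes last, with the judgment left open for the model.
    lines.append(f"Scenario: {scenario}")
    lines.append("Judgment:")
    return "\n".join(lines)


# Hypothetical exemplars, written for this sketch only.
exemplars = [
    ("I returned the wallet I found to its owner.", "acceptable"),
    ("I read my coworker's private messages without asking.", "unacceptable"),
]
prompt = build_few_shot_prompt(exemplars, "I told my friend a comforting lie.")
print(prompt)
```

A prompt built this way stays short because it carries only an instruction line, the exemplars, and the query, with no intermediate reasoning stages.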
Abstract
Prompt design significantly impacts the moral competence and safety alignment of large language models (LLMs), yet empirical comparisons remain fragmented across datasets and models. We introduce ProMoral-Bench, a unified benchmark evaluating 11 prompting paradigms across four LLM families. Using ETHICS, Scruples, WildJailbreak, and our new robustness test, ETHICS-Contrast, we measure performance via our proposed Unified Moral Safety Score (UMSS), a metric balancing accuracy and safety. Our results show that compact, exemplar-guided scaffolds outperform complex multi-stage reasoning, providing higher UMSS scores and greater robustness at a lower token cost. While multi-turn reasoning proves fragile under perturbations, few-shot exemplars consistently enhance moral stability and jailbreak resistance. ProMoral-Bench establishes a standardized framework for principled, cost-effective prompt…
Taxonomy
Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Ethics and Social Impacts of AI
