Literary Narrative as Moral Probe: A Cross-System Framework for Evaluating AI Ethical Reasoning and Refusal Behavior
David C. Flynn

TL;DR
This paper proposes a probe methodology based on literary narrative for evaluating whether AI systems reason morally rather than merely produce correct-sounding ethical responses. It reports that current systems respond at a surface level and argues for evaluations that resist surface performance.
Contribution
Introduces a cross-system literary narrative probe framework that resists surface performance, enabling measurement of authentic moral reasoning in AI systems.
Findings
Zero delta between blind and declared conditions across all 16 dimension-pair comparisons
Perfect agreement between independent judges on the theological differentiator
Identification of five distinct moral failure modes
Abstract
Existing AI moral evaluation frameworks test for the production of correct-sounding ethical responses rather than the presence of genuine moral reasoning capacity. This paper introduces a novel probe methodology using literary narrative (specifically, unresolvable moral scenarios drawn from a published science fiction series) as stimulus material structurally resistant to surface performance. We present results from a 24-condition cross-system study spanning 13 distinct systems across two series: Series 1 (frontier commercial systems, blind; n=7) and Series 2 (local and API open-source systems, blind and declared; n=6). Four Series 2 systems were re-administered under declared conditions (13 blind + 4 declared + 7 ceiling probe = 24 total conditions), yielding zero delta across all 16 dimension-pair comparisons. Probe administration was conducted by two human raters across three…
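The condition counts quoted in the abstract can be checked with a short tally. This is only an arithmetic sketch: the variable names are illustrative and not taken from the paper, and the counts are copied from the abstract as stated. It does not attempt to derive the 16 dimension-pair comparisons, since the abstract does not say how that figure breaks down.

```python
# Counts as stated in the abstract; variable names are illustrative assumptions.
series_1_blind = 7   # frontier commercial systems, blind
series_2_blind = 6   # local and API open-source systems, blind
declared = 4         # Series 2 systems re-administered under declared conditions
ceiling_probe = 7    # ceiling probe conditions

# Each distinct system is administered once under blind conditions.
distinct_systems = series_1_blind + series_2_blind
total_conditions = distinct_systems + declared + ceiling_probe

print(f"distinct systems: {distinct_systems}")
print(f"total conditions: {total_conditions}")
```

The tally reproduces the abstract's own breakdown: 13 blind, 4 declared, and 7 ceiling probe conditions sum to the 24 conditions reported.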
Taxonomy
Topics: Ethics and Social Impacts of AI · Psychology of Moral and Emotional Judgment · Explainable Artificial Intelligence (XAI)
