Integrating Reason-Based Moral Decision-Making in the Reinforcement Learning Architecture

Lisa Dargasz

arXiv:2507.15895·cs.AI·July 23, 2025

Open Access

TL;DR

This paper proposes reason-based artificial moral agents (RBAMAs), an extension of the reinforcement learning architecture that enables autonomous agents to make ethically justified decisions through normative reasoning.

Contribution

It introduces the concept of RBAMAs, integrating normative reasoning into reinforcement learning to enhance ethical decision-making in autonomous agents.

Findings

01 First implementation of RBAMA demonstrated in initial experiments

02 RBAMAs adapt behavior to moral obligations

03 Framework shows potential for ethical autonomous systems

Abstract

Reinforcement Learning is a machine learning methodology that has demonstrated strong performance across a variety of tasks. In particular, it plays a central role in the development of artificial autonomous agents. As these agents become increasingly capable, market readiness is rapidly approaching, which means those agents, for example taking the form of humanoid robots or autonomous cars, are poised to transition from laboratory prototypes to autonomous operation in real-world environments. This transition raises concerns leading to specific requirements for these systems - among them, the requirement that they are designed to behave ethically. Crucially, research directed toward building agents that fulfill the requirement to behave ethically - referred to as artificial moral agents (AMAs) - has to address a range of challenges at the intersection of computer science and philosophy.…

Taxonomy

Topics: Psychology of Moral and Emotional Judgment · Adversarial Robustness in Machine Learning