Doing the right thing for the right reason: Evaluating artificial moral cognition by probing cost insensitivity

Yiran Mao; Madeline G. Reinecke; Markus Kunesch; Edgar A. Duéñez-Guzmán; Ramona Comanescu; Julia Haas; Joel Z. Leibo

arXiv:2305.18269 · cs.AI · May 30, 2023 · 1 citation

Open Access

TL;DR

This paper proposes a behavior-based method to evaluate artificial moral cognition by measuring agents' sensitivity to costs, revealing differences in moral motivation between agents with different training objectives.

Contribution

It introduces a novel evaluation approach for artificial moral cognition based on cost sensitivity, applicable to both AI agents and humans.

Findings

01. Agents with other-regarding preferences show less cost sensitivity in helping behavior.

02. Cost insensitivity correlates with morally-motivated behavior.

03. The method enables comparison of moral cognition across agents.

Abstract

Is it possible to evaluate the moral cognition of complex artificial agents? In this work, we take a look at one aspect of morality: 'doing the right thing for the right reasons.' We propose a behavior-based analysis of artificial moral cognition which could also be applied to humans to facilitate like-for-like comparison. Morally-motivated behavior should persist despite mounting cost; by measuring an agent's sensitivity to this cost, we gain deeper insight into underlying motivations. We apply this evaluation to a particular set of deep reinforcement learning agents, trained by memory-based meta-reinforcement learning. Our results indicate that agents trained with a reward function that includes other-regarding preferences perform helping behavior in a way that is less sensitive to increasing cost than agents trained with more self-interested preferences.
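
As a rough illustration of what measuring cost sensitivity could look like in practice, the sketch below computes the fraction of trials in which an agent helped at each cost level and then fits a least-squares slope to those rates. This is not the paper's metric or code: the function names, the (cost, helped) trial format, and the sample data are all assumptions made for the example. Under this reading, a slope near zero means helping persists as cost rises, and a strongly negative slope means helping drops off.

```python
from statistics import mean


def helping_rate_by_cost(trials):
    """Group hypothetical (cost, helped) trials and return {cost: fraction helped}."""
    by_cost = {}
    for cost, helped in trials:
        by_cost.setdefault(cost, []).append(1.0 if helped else 0.0)
    return {cost: mean(vals) for cost, vals in sorted(by_cost.items())}


def cost_sensitivity(rates):
    """Least-squares slope of helping rate against cost (illustrative measure only)."""
    costs = list(rates)
    if len(costs) < 2:
        raise ValueError("need at least two distinct cost levels")
    ys = [rates[c] for c in costs]
    cx, cy = mean(costs), mean(ys)
    num = sum((c - cx) * (y - cy) for c, y in zip(costs, ys))
    den = sum((c - cx) ** 2 for c in costs)
    return num / den


# Made-up trial logs for two hypothetical agents: (cost of helping, did it help?)
persistent_agent = [(0, True), (1, True), (2, True), (3, True)]
fair_weather_agent = [(0, True), (0, True), (1, True), (1, False), (2, False), (2, False)]

persistent_slope = cost_sensitivity(helping_rate_by_cost(persistent_agent))
fair_weather_slope = cost_sensitivity(helping_rate_by_cost(fair_weather_agent))
print(persistent_slope, fair_weather_slope)
```

Because the same calculation only needs observed choices at known cost levels, it could in principle be run on human behavioral data as well, which is the kind of like-for-like comparison the abstract has in mind.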

Taxonomy

Topics: Psychology of Moral and Emotional Judgment · Experimental Behavioral Economics Studies · Adversarial Robustness in Machine Learning