FairMindSim: Alignment of Behavior, Emotion, and Belief in Humans and LLM Agents Amid Ethical Dilemmas

Yu Lei; Hao Liu; Chengxing Xie; Songjia Liu; Zhiyu Yin; Canyu Chen; Guohao Li; Philip Torr; Zhen Wu

arXiv:2410.10398 · cs.CE · October 18, 2024 · 3 cites

Open Access

TL;DR

This paper introduces FairMindSim, a simulation framework for aligning LLM and human moral behavior in ethical dilemmas, incorporating sociological insights and a new belief-reward model.

Contribution

It presents FairMindSim and the BREM model to enhance understanding of moral alignment in AI and humans during ethical conflicts.

Findings

01. GPT-4o shows stronger social justice tendencies

02. Humans exhibit a richer emotional spectrum

03. Emotions influence moral decision-making

Abstract

AI alignment is a pivotal issue concerning AI control and safety. It should consider not only value-neutral human preferences but also moral and ethical considerations. In this study, we introduced FairMindSim, which simulates moral dilemmas through a series of unfair scenarios. We used LLM agents to simulate human behavior, ensuring alignment across various stages. We aimed to explore the various socioeconomic motivations, which we refer to as beliefs, that drive both humans and LLM agents as bystanders to intervene in unjust situations involving others, and how these beliefs interact to influence individual behavior. To that end, we incorporated knowledge from relevant sociological fields and proposed the Belief-Reward Alignment Behavior Evolution Model (BREM), based on the recursive reward model (RRM). Our findings indicate that, behaviorally, GPT-4o exhibits a stronger sense of social justice, while…

Taxonomy

Topics: Ethics and Social Impacts of AI · Psychology of Moral and Emotional Judgment · Adversarial Robustness in Machine Learning