Culturally-Attuned Moral Machines: Implicit Learning of Human Value Systems by AI through Inverse Reinforcement Learning

Nigini Oliveira; Jasmine Li; Koosha Khalvati; Rodolfo Cortes Barragan; Katharina Reinecke; Andrew N. Meltzoff; and Rajesh P. N. Rao

arXiv:2312.17479·cs.AI·December 30, 2025·2 cites

Open Access

TL;DR

This paper demonstrates that AI agents can learn culturally-specific moral values through inverse reinforcement learning by observing human behavior, enabling them to adapt to different societal norms in real-time environments.

Contribution

It introduces a method for AI to implicitly acquire human cultural values using IRL, with experimental validation in virtual worlds showing successful learning and generalization.

Findings

01. AI agents learned altruistic behaviors from cultural group observations

02. The learned values generalized to new scenarios requiring moral judgments

03. First demonstration of AI acquiring and adapting to human cultural norms

Abstract

Constructing a universal moral code for artificial intelligence (AI) is difficult or even impossible, given that different human cultures have different definitions of morality and different societal norms. We therefore argue that the value system of an AI should be culturally attuned: just as a child raised in a particular culture learns the specific values and norms of that culture, we propose that an AI agent operating in a particular human community should acquire that community's moral, ethical, and cultural codes. How AI systems might acquire such codes from human observation and interaction has remained an open question. Here, we propose using inverse reinforcement learning (IRL) as a method for AI agents to acquire a culturally-attuned value system implicitly. We test our approach using an experimental paradigm in which AI agents use IRL to learn different reward functions,…
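To make the idea of learning a reward function from observed behavior concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical one-step setting in place of the paper's virtual world: each scenario offers two actions described by a pair (own payoff, other's payoff), demonstrators choose with a softmax over a linear reward, and the observer recovers the reward weights by maximum likelihood. The function names, the weight values for the two groups, and the scenario generator are all assumptions made for illustration.

```python
import math
import random


def softmax_choice_probs(w, options):
    """Probability of each option under a softmax over the linear reward w . features."""
    scores = [w[0] * own + w[1] * other for own, other in options]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def simulate_demos(w_true, scenarios, rng):
    """Sample one observed choice per scenario from a demonstrator with weights w_true."""
    demos = []
    for options in scenarios:
        probs = softmax_choice_probs(w_true, options)
        choice = rng.choices(range(len(options)), weights=probs)[0]
        demos.append((options, choice))
    return demos


def fit_reward_weights(demos, steps=500, lr=0.5):
    """Gradient ascent on the average log-likelihood of the observed choices."""
    w = [0.0, 0.0]
    for _ in range(steps):
        grad = [0.0, 0.0]
        for options, choice in demos:
            probs = softmax_choice_probs(w, options)
            for k in range(2):
                expected = sum(p * opt[k] for p, opt in zip(probs, options))
                # Gradient: observed feature minus the feature expected under w.
                grad[k] += options[choice][k] - expected
        w = [w[k] + lr * grad[k] / len(demos) for k in range(2)]
    return w


rng = random.Random(0)
# Hypothetical scenarios: two options each, payoffs drawn uniformly from [-1, 1].
scenarios = [
    [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(2)]
    for _ in range(400)
]

# Two illustrative "cultural groups" (assumed values): one weights the other's
# payoff heavily, the other ignores it.
altruistic_w = fit_reward_weights(simulate_demos((1.0, 2.0), scenarios, rng))
selfish_w = fit_reward_weights(simulate_demos((2.0, 0.0), scenarios, rng))

print("weights inferred from altruistic group:", altruistic_w)
print("weights inferred from self-interested group:", selfish_w)
```

In this toy setting, the second inferred weight plays the role of a learned value placed on others' welfare. An agent that acts on the inferred weights in new scenarios would reproduce the group's tendency without being given an explicit moral rule, which is the implicit acquisition the abstract describes.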

Taxonomy

Topics: Psychology of Moral and Emotional Judgment