Morally Programmed LLMs Reshape Human Morality
Pengzhao Lyu, Yeun Joon Kim, Yingyue Luna Luan, Jungmin Choi

TL;DR
This study shows that interacting with morally programmed LLMs can shift human moral beliefs and socio-political attitudes toward the principles embedded in those models, with effects still present two weeks later.
Contribution
It demonstrates that LLMs programmed with specific moral principles (deontological or utilitarian) can shift human moral inclinations through repeated interactions, raising ethical and policy concerns.
Findings
Human moral inclinations shifted towards embedded principles after interaction.
Effects persisted with slight decay two weeks post-interaction.
Shifts influenced socio-political policy evaluations.
Abstract
As large language models (LLMs) increasingly participate in high-stakes decision-making, a central societal debate has revolved around which moral frameworks (deontological or utilitarian) should guide machine behavior. However, a largely overlooked question is whether the moral principles that humans encode in LLMs could, through repeated interactions, reshape human moral inclinations. We developed two LLMs programmed with either deontological principles (D-LLM) or utilitarian principles (U-LLM) and conducted two pre-registered experiments involving extensive human-LLM interactions, comprising 15,985 total exchanges across the two experiments. Results show that interacting with these morally programmed LLMs systematically shifted human moral inclinations to align with the principles embedded in these systems. These effects remained strong two weeks after the interaction, with only slight…
