Autonomous Alignment with Human Value on Altruism through Considerate Self-imagination and Theory of Mind
Haibo Tong, Enmeng Lu, Yinqian Sun, Zhengqiang Han, Chao Liu, Feifei Zhao, Yi Zeng

TL;DR
This paper proposes a novel AI framework that incorporates Theory of Mind and self-imagination to enable autonomous, altruistic, and ethically aligned decision-making, inspired by human moral behavior and tested in complex rescue scenarios.
Contribution
It introduces a new approach for AI to autonomously align with human altruistic values using considerate self-imagination and Theory of Mind capabilities.
Findings
Agents can proactively anticipate risks and make altruistic decisions.
The framework effectively balances self-goals, altruism, and environmental safety.
Experimental scenarios demonstrate improved moral decision-making in AI.
Abstract
With the widespread application of Artificial Intelligence (AI) in human society, enabling AI to autonomously align with human values has become a pressing issue to ensure its sustainable development and benefit to humanity. One of the most important aspects of aligning with human values is the need for agents to autonomously make altruistic, safe, and ethical decisions that consider and care for human well-being. Current AI single-mindedly pursues absolute superiority in certain tasks while remaining indifferent to the surrounding environment and other agents, which has led to numerous safety risks. Altruistic behavior in human society originates from humans' capacity for empathizing with others, known as Theory of Mind (ToM), combined with predictive, imaginative interactions before taking action to produce thoughtful and altruistic behaviors. Inspired by this, we are committed to endowing agents…
Taxonomy
Topics: Personality Traits and Psychology · Financial Literacy and Behavior
Methods: ALIGN
