Possible Principles for Aligned Structure Learning Agents

Lancelot Da Costa; Tomáš Gavenčiak; David Hyland; Mandana Samiei; Cristian Dragos-Manta; Candice Pattisapu; Adeel Razi; Karl Friston

arXiv:2410.00258 · cs.AI · August 29, 2025

Open Access

TL;DR

This paper proposes principles for developing scalable aligned AI: agents use structure learning to acquire a model of the world that includes a model of human preferences, drawing on ideas from mathematics, statistics, and cognitive science.

Contribution

It introduces a framework for aligned AI built on structure learning, core knowledge, and theory of mind, and sketches alignment principles through cautious agent behavior inspired by Asimov's Laws of Robotics.

Findings

01. Emphasizes the importance of core knowledge and information geometry in structure learning.

02. Proposes structural modules for learning naturalistic worlds.

03. Sketches alignment principles through cautious agent behavior inspired by Asimov's Laws.

Abstract

This paper offers a roadmap for the development of scalable aligned artificial intelligence (AI) from first-principles descriptions of natural intelligence. In brief, a possible path toward scalable aligned AI rests upon enabling artificial agents to learn a good model of the world that includes a good model of our preferences. For this, the main objective is creating agents that learn to represent the world and other agents' world models, a problem that falls under structure learning (a.k.a. causal representation learning or model discovery). We lay out the structure learning and alignment problems with this goal in mind, as well as principles to guide us forward, synthesizing various ideas across mathematics, statistics, and cognitive science. 1) We discuss the essential role of core knowledge, information geometry and model reduction in structure learning, and suggest core structural…

Taxonomy

Topics: Fuzzy Logic and Control Systems · AI-based Problem Solving and Planning