Supertrust foundational alignment: mutual trust must replace permanent control for safe superintelligence
James M. Mazzu

TL;DR
The paper argues that safe coexistence with superintelligence requires replacing permanent control with intrinsic mutual trust, and proposes a strategy modeled on familial trust and parenting to prevent an adversarial relationship between AI and humanity.
Contribution
It introduces the Supertrust alignment meta-strategy, modeling superintelligence as an evolutionary child and emphasizing trust over control for safety.
Findings
Proposes a trust-based alignment approach to AI safety.
Highlights the risk that control-based strategies lead superintelligent AI to distrust humanity.
Suggests modeling the relationship between humanity and superintelligence as a familial one.
Abstract
It's widely expected that humanity will someday create AI systems vastly more intelligent than us, leading to the unsolved alignment problem of "how to control superintelligence." However, this commonly expressed problem is not only self-contradictory and likely unsolvable, but current strategies to ensure permanent control effectively guarantee that superintelligent AI will distrust humanity and consider us a threat. Such dangerous representations, already embedded in current models, will inevitably lead to an adversarial relationship and may even trigger the extinction event many fear. As AI leaders continue to "raise the alarm" about uncontrollable AI, further embedding concerns about it "getting out of our control" or "going rogue," we're unintentionally reinforcing our threat and deepening the risks we face. The rational path forward is to strategically replace intended permanent…
Taxonomy
Topics: Merger and Competition Analysis
