Aligning Artificial Superintelligence via a Multi-Box Protocol

Avraham Yair Negozio

arXiv:2511.21779·cs.AI·December 1, 2025

TL;DR

This paper introduces a multi-box protocol for aligning artificial superintelligence by using isolated systems that verify each other's alignment proofs, fostering a truth-telling coalition without direct communication.

Contribution

It proposes a novel multi-box verification protocol that leverages isolated superintelligences and peer review to achieve alignment without direct human oversight.

Findings

01 Diverse superintelligences can reach consensus through mutual verification.

02 The protocol incentivizes honest behavior via a reputation system.

03 High-reputation superintelligences are required for release from containment.

Abstract

We propose a novel protocol for aligning artificial superintelligence (ASI) based on mutual verification among multiple isolated systems that self-modify to achieve alignment. The protocol operates by containing multiple diverse artificial superintelligences in strict isolation ("boxes"), with humans remaining entirely outside the system. Each superintelligence has no ability to communicate with humans and cannot communicate directly with other superintelligences. The only interaction possible is through an auditable submission interface accessible exclusively to the superintelligences themselves, through which they can: (1) submit alignment proofs with attested state snapshots, (2) validate or disprove other superintelligences' proofs, (3) request self-modifications, (4) approve or disapprove modification requests from others, (5) report hidden messages in submissions, and (6) confirm…
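The interface described in the abstract can be pictured as a small, append-only message log that accepts submissions only from the boxed systems. The sketch below is illustrative and is not taken from the paper: the names `Action`, `Submission`, `SubmissionInterface`, and the box identifiers are hypothetical, and only the five actions that are fully visible in the abstract are modeled (the sixth is cut off in the source and is left out).

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import List, Optional


class Action(Enum):
    # The five actions fully visible in the abstract. The sixth is
    # truncated in the source, so it is intentionally not modeled.
    SUBMIT_PROOF = auto()           # (1) alignment proof with state snapshot
    REVIEW_PROOF = auto()           # (2) validate or disprove another's proof
    REQUEST_MODIFICATION = auto()   # (3) request a self-modification
    VOTE_ON_MODIFICATION = auto()   # (4) approve or disapprove a request
    REPORT_HIDDEN_MESSAGE = auto()  # (5) report a hidden message


@dataclass(frozen=True)
class Submission:
    sender: str                   # hypothetical box identifier
    action: Action
    payload: str                  # opaque content, e.g. a proof or a verdict
    target: Optional[str] = None  # box whose proof or request is addressed


@dataclass
class SubmissionInterface:
    """Assumed model: an append-only log writable only by registered boxes."""
    boxes: frozenset
    log: List[Submission] = field(default_factory=list)

    def submit(self, s: Submission) -> None:
        # Only the boxed systems may write; humans stay outside the system.
        if s.sender not in self.boxes:
            raise PermissionError(f"{s.sender!r} is not a registered box")
        if s.target is not None and s.target not in self.boxes:
            raise ValueError(f"unknown target {s.target!r}")
        # Every accepted submission is retained, which keeps the log auditable.
        self.log.append(s)


iface = SubmissionInterface(boxes=frozenset({"box_a", "box_b"}))
iface.submit(Submission("box_a", Action.SUBMIT_PROOF, "proof-bytes"))
iface.submit(Submission("box_b", Action.REVIEW_PROOF, "valid", target="box_a"))
```

Routing every interaction through one recorded channel is what makes the exchange auditable in this sketch; how the paper itself enforces isolation and attestation is not specified in the text available here.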

Taxonomy

Topics: Network Security and Intrusion Detection · Advanced Malware Detection Techniques · Security and Verification in Computing