Neurodivergent Influenceability as a Contingent Solution to the AI Alignment Problem

Alberto Hernández-Espinosa; Felipe S. Abrahão; Olaf Witkowski; Hector Zenil

arXiv:2505.02581·cs.AI·July 25, 2025

Open Access

TL;DR

This paper proposes that embracing AI misalignment as an inevitable feature can serve as a strategic approach to mitigate risks and promote human-aligned outcomes through a dynamic ecosystem of competing agents.

Contribution

It introduces a mathematical proof of the inevitability of AI misalignment in Turing-complete systems and explores how this can be leveraged to steer AI development towards safety.

Findings

01. Open models exhibit greater diversity in behavior.

02. Guardrails in proprietary models effectively control some AI behaviors.

03. Human and AI interventions have distinct impacts on AI behavior.

Abstract

The AI alignment problem, which focuses on ensuring that artificial intelligence (AI) systems, including AGI and ASI, act according to human values, presents profound challenges. With the progression from narrow AI to Artificial General Intelligence (AGI) and Superintelligence, fears about control and existential risk have escalated. Here, we investigate whether embracing inevitable AI misalignment can be a contingent strategy to foster a dynamic ecosystem of competing agents as a viable path to steer them toward more human-aligned trends and mitigate risks. We explore how misalignment may serve, and should be promoted, as a counterbalancing mechanism to team up with whichever agents are most aligned with human interests, ensuring that no single system dominates destructively. The main premise of our contribution is that misalignment is inevitable because full AI-human alignment is a…

Taxonomy

Topics: Cognitive Science and Mapping