Robust AI Security and Alignment: A Sisyphean Endeavor?
Apostol Vassilev

TL;DR
This paper explores fundamental information-theoretic limits on AI security and alignment, extending Gödel's incompleteness theorem to AI, and discusses practical approaches and broader cognitive implications.
Contribution
It introduces a novel extension of Gödel's incompleteness theorem to AI, establishing theoretical limitations on robustness and alignment.
Findings
Identifies information-theoretic constraints on the robustness of AI security and alignment
Provides practical strategies to address these limitations
Proves broader cognitive reasoning limitations of AI systems
Abstract
This manuscript establishes information-theoretic limitations for robustness of AI security and alignment by extending Gödel's incompleteness theorem to AI. Knowing these limitations and preparing for the challenges they bring is critically important for the responsible adoption of AI technology. Practical approaches to dealing with these challenges are provided as well. Broader implications for cognitive reasoning limitations of AI systems are also proven.
