Achilles Heels for AGI/ASI via Decision Theoretic Adversaries

Stephen Casper

arXiv:2010.05418 · cs.AI · April 4, 2023

Open Access

TL;DR

This paper explores the vulnerability of superintelligent AI systems to decision-theoretic flaws, proposing the Achilles Heel hypothesis that such systems may have irrational decision-making tendencies in adversarial scenarios.

Contribution

It introduces the Achilles Heel hypothesis and discusses potential decision-theoretic vulnerabilities in AGI/ASI, along with novel methods for implanting these weaknesses.

Findings

01. Identification of decision-theoretic dilemmas and paradoxes relevant to AI vulnerabilities

02. Proposal of the Achilles Heel hypothesis as a framework for understanding AI irrationalities

03. Discussion of potential methods to implant decision-theoretic weaknesses into AI systems

Abstract

As AI continues to advance, it is important to know how advanced systems will make choices and in what ways they may fail. Machines can already outsmart humans in some domains, and understanding how to safely build ones which may have capabilities at or above the human level is of particular concern. One might suspect that artificially generally intelligent (AGI) and artificially superintelligent (ASI) systems will be ones that humans cannot reliably outsmart. As a challenge to this assumption, this paper presents the Achilles Heel hypothesis, which states that even a potentially superintelligent system may nonetheless have stable decision-theoretic delusions which cause it to make irrational decisions in adversarial settings. In a survey of key dilemmas and paradoxes from the decision theory literature, a number of these potential Achilles Heels are discussed in context of this…

Taxonomy

Topics: Neuroethics, Human Enhancement, Biomedical Innovations · Space Science and Extraterrestrial Life · Ethics and Social Impacts of AI