Extinction Risks from AI: Invisible to Science?

Vojtech Kovarik; Christian van Merwijk; Ida Mattsson

arXiv:2403.05540 · cs.CY · March 12, 2024 · 1 citation

Open Access

TL;DR

This paper explores the theoretical challenges in modeling extinction risks from AI, proposing conditions for effective models and suggesting that such risks may be inherently difficult to detect scientifically.

Contribution

It identifies necessary conditions for models assessing AI extinction risks and highlights the complexity that may render these risks scientifically invisible.

Findings

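01. The paper identifies conditions a model must satisfy to be informative about specific arguments for Extinction-level Goodhart's Law.

02. Each condition appears to add significantly to model complexity, which may make formal evaluation of the hypothesis exceedingly difficult.

03. Extinction risks from AI might be inherently undetectable with current science.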

Abstract

In an effort to inform the discussion surrounding existential risks from AI, we formulate Extinction-level Goodhart's Law as "Virtually any goal specification, pursued to the extreme, will result in the extinction of humanity", and we aim to understand which formal models are suitable for investigating this hypothesis. Note that we remain agnostic as to whether Extinction-level Goodhart's Law holds or not. As our key contribution, we identify a set of conditions that are necessary for a model that aims to be informative for evaluating specific arguments for Extinction-level Goodhart's Law. Since each of the conditions seems to significantly contribute to the complexity of the resulting model, formally evaluating the hypothesis might be exceedingly difficult. This raises the possibility that whether the risk of extinction from artificial intelligence is real or not, the underlying…

Taxonomy

Topics: Explainable Artificial Intelligence (XAI)