Declare and Justify: Explicit assumptions in AI evaluations are necessary for effective regulation

Peter Barnett; Lisa Thiergart

arXiv:2411.12820 · cs.AI · November 21, 2024

TL;DR

This paper advocates for requiring AI developers to explicitly state and justify core assumptions in evaluations to improve transparency and safety regulation of AI systems.

Contribution

It argues that explicit assumptions in AI evaluations are essential for effective regulation, and proposes that developers be required to identify and justify these assumptions as part of their case for safety.

Findings

01

Many core assumptions in AI evaluations cannot currently be well justified.

02

Explicitly stating assumptions can improve transparency and safety.

03

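Regulation based on evaluations should require halting AI development if evaluations demonstrate unacceptable danger or if the underlying assumptions are inadequately justified.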

Abstract

As AI systems advance, AI evaluations are becoming an important pillar of regulations for ensuring safety. We argue that such regulation should require developers to explicitly identify and justify key underlying assumptions about evaluations as part of their case for safety. We identify core assumptions in AI evaluations (both for evaluating existing models and forecasting future models), such as comprehensive threat modeling, proxy task validity, and adequate capability elicitation. Many of these assumptions cannot currently be well justified. If regulation is to be based on evaluations, it should require that AI development be halted if evaluations demonstrate unacceptable danger or if these assumptions are inadequately justified. Our presented approach aims to enhance transparency in AI development, offering a practical path towards more effective governance of advanced AI systems.

Taxonomy

Topics: Ethics and Social Impacts of AI