Coordinated pausing: An evaluation-based coordination scheme for frontier AI developers
Jide Alaga, Jonas Schuett

TL;DR
This paper proposes an evaluation-based coordination scheme in which frontier AI developers pause certain research and development activities when dangerous capabilities are discovered, with the aim of mitigating risks from capabilities that emerge unintentionally and unpredictably.
Contribution
It introduces a structured, multi-step coordination scheme with four concrete versions to promote safety and collaboration among AI developers in managing dangerous capabilities.
Findings
Coordinated pausing is presented as a promising mechanism for mitigating risks from frontier AI models with dangerous capabilities.
The four versions of the scheme differ in how voluntary and how legally enforceable they are.
Practical and legal challenges must be addressed for implementation.
Abstract
As artificial intelligence (AI) models are scaled up, new capabilities can emerge unintentionally and unpredictably, some of which might be dangerous. In response, dangerous capabilities evaluations have emerged as a new risk assessment tool. But what should frontier AI developers do if sufficiently dangerous capabilities are in fact discovered? This paper focuses on one possible response: coordinated pausing. It proposes an evaluation-based coordination scheme that consists of five main steps: (1) Frontier AI models are evaluated for dangerous capabilities. (2) Whenever, and each time, a model fails a set of evaluations, the developer pauses certain research and development activities. (3) Other developers are notified whenever a model with dangerous capabilities has been discovered. They also pause related research and development activities. (4) The discovered capabilities are…
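Steps (1) to (3) of the scheme can be read as a small protocol. The following is a minimal sketch of that reading, not the authors' specification: the names `Developer`, `fails_evaluations`, and `coordinated_pause` are hypothetical, and treating "fails a set of evaluations" as "any capability score reaches a threshold" is an assumption made for illustration. The later steps are left out because the abstract is truncated here.

```python
from dataclasses import dataclass, field


@dataclass
class Developer:
    """Hypothetical stand-in for a frontier AI developer."""
    name: str
    paused: bool = False
    notifications: list = field(default_factory=list)


def fails_evaluations(scores, thresholds):
    """Assumed failure rule: any capability score reaches its danger threshold."""
    return any(scores.get(cap, 0.0) >= limit for cap, limit in thresholds.items())


def coordinated_pause(evaluated, others, scores, thresholds):
    """Sketch of steps (1)-(3): evaluate, pause the developer, notify and pause others."""
    # Step 1: the model is evaluated for dangerous capabilities.
    if not fails_evaluations(scores, thresholds):
        return False
    # Step 2: the developer whose model failed pauses certain R&D activities.
    evaluated.paused = True
    # Step 3: other developers are notified and pause related activities.
    for dev in others:
        dev.notifications.append(f"dangerous capabilities found by {evaluated.name}")
        dev.paused = True
    return True


# Usage with made-up scores and thresholds.
lab_a = Developer("Lab A")
lab_b = Developer("Lab B")
triggered = coordinated_pause(
    lab_a, [lab_b], {"capability_x": 0.9}, {"capability_x": 0.5}
)
```

In this sketch a single failed evaluation pauses every participating developer. Which activities are paused, and who enforces the pause, is where the four versions of the scheme would differ.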
Taxonomy
Topics: Ethics and Social Impacts of AI · Law, AI, and Intellectual Property · Adversarial Robustness in Machine Learning
