Risk thresholds for frontier AI
Leonie Koessler, Jonas Schuett, Markus Anderljung

TL;DR
This paper discusses risk thresholds for frontier AI. It proposes a layered approach in which risk thresholds inform capability thresholds, and it draws out implications for industry and regulation as risk estimation improves.
Contribution
It introduces the idea of using risk thresholds alongside capability thresholds to improve AI safety decision-making and offers recommendations for industry and regulators.
Findings
Risk thresholds provide a principled basis for AI safety decisions.
AI risks are currently difficult to estimate reliably.
A layered approach enhances safety and regulation.
Abstract
Frontier artificial intelligence (AI) systems could pose increasing risks to public safety and security. But what level of risk is acceptable? One increasingly popular approach is to define capability thresholds, which describe AI capabilities beyond which an AI system is deemed to pose too much risk. A more direct approach is to define risk thresholds that simply state how much risk would be too much. For instance, they might state that the likelihood of cybercriminals using an AI system to cause X amount of economic damage must not increase by more than Y percentage points. The main upside of risk thresholds is that they are more principled than capability thresholds, but the main downside is that they are more difficult to evaluate reliably. For this reason, we currently recommend that companies (1) define risk thresholds to provide a principled foundation for their decision-making,…
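The abstract's example threshold (the likelihood of a given harm must not increase by more than Y percentage points) can be read as a simple comparison between two probability estimates. The sketch below is only an illustration of that arithmetic, not the authors' method: the names `RiskThreshold` and `exceeds_threshold` are hypothetical, and it assumes that someone has already produced a baseline estimate and an estimate with the AI system available, which the paper notes is the hard part.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskThreshold:
    """Hypothetical container for the "Y" in the abstract's example:
    the maximum tolerated increase in likelihood, in percentage points."""
    max_increase_pp: float


def exceeds_threshold(baseline_prob: float,
                      prob_with_system: float,
                      threshold: RiskThreshold) -> bool:
    """Return True if the estimated likelihood of the harm rises by more
    than the threshold allows.

    Both probabilities are assumed inputs in [0, 1]; producing reliable
    estimates is outside the scope of this sketch.
    """
    for p in (baseline_prob, prob_with_system):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    # Convert the difference in probability to percentage points.
    increase_pp = (prob_with_system - baseline_prob) * 100.0
    return increase_pp > threshold.max_increase_pp


# Purely illustrative numbers, not taken from the paper.
threshold = RiskThreshold(max_increase_pp=2.0)
print(exceeds_threshold(0.01, 0.04, threshold))  # rise of about 3 points
print(exceeds_threshold(0.01, 0.02, threshold))  # rise of about 1 point
```

The comparison is in percentage points (an absolute difference), matching the wording of the abstract's example, rather than a relative increase over the baseline.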
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Artificial Intelligence in Healthcare and Education · Machine Learning in Healthcare
