Evaluating AI Providers' Frontier Safety Frameworks
Lily Stelling, Malcolm Murray, Bruno Galizzi, Max Schaffelder, Siméon Campos, Henry Papadatos

TL;DR
This paper evaluates 12 frontier AI safety frameworks published by major AI companies. It finds their commitments generally limited and suggests that adopting existing leading practices would strengthen accountability and risk management.
Contribution
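It introduces a systematic assessment method for frontier AI safety frameworks, using 65 weighted criteria across four risk management dimensions (risk identification, risk analysis & evaluation, risk treatment, and risk governance), and uses it to highlight gaps and the potential for better practices.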
Findings
Scores range from 8% to 34%, median 18%.
Many frameworks lack detail and are under-specified.
Adopting leading practices could triple current scores.
Abstract
Following the AI Seoul Summit in 2024, twelve AI companies published frontier AI safety frameworks (Frameworks) outlining their approaches to managing catastrophic risks from advanced AI systems. Emerging legislation increasingly treats these Frameworks as external accountability mechanisms, incorporating them into reporting requirements. But what do the Frameworks actually commit each company to do? This study assesses 12 Frameworks, using 65 weighted criteria, across four dimensions: risk identification, risk analysis & evaluation, risk treatment, and risk governance. Our criteria adapt established risk management principles from other high-risk industries (e.g. aviation, nuclear power) to the frontier AI context, following Campos et al. (2025). Overall scores range from 34% (Anthropic) to 8% (Cohere), with a median of 18%. Many aspects are missing or under-specified. These low…
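As a rough illustration of how weighted criteria can be rolled up into a percentage score, the sketch below computes an overall score and a per-dimension breakdown. It is not the paper's scoring procedure: the four-row criteria list, the weights, the grades on a 0 to 1 scale, and the function names `weighted_score` and `score_by_dimension` are all hypothetical and chosen only to show the arithmetic.

```python
from collections import defaultdict

# Hypothetical criteria: (dimension, weight, grade in [0, 1]).
# Weights and grades are illustrative only and are NOT taken from the paper.
CRITERIA = [
    ("risk identification", 2.0, 0.5),
    ("risk analysis & evaluation", 1.0, 0.0),
    ("risk treatment", 1.0, 0.25),
    ("risk governance", 4.0, 0.25),
]


def weighted_score(criteria):
    """Return the weighted mean grade as a percentage (0 to 100)."""
    total_weight = sum(weight for _, weight, _ in criteria)
    if total_weight == 0:
        raise ValueError("criteria must have a positive total weight")
    earned = sum(weight * grade for _, weight, grade in criteria)
    return 100.0 * earned / total_weight


def score_by_dimension(criteria):
    """Group criteria by dimension and score each group separately."""
    groups = defaultdict(list)
    for item in criteria:
        groups[item[0]].append(item)
    return {dim: weighted_score(items) for dim, items in groups.items()}


if __name__ == "__main__":
    print(f"overall: {weighted_score(CRITERIA):.1f}%")
    for dim, pct in score_by_dimension(CRITERIA).items():
        print(f"{dim}: {pct:.1f}%")
```

Under this assumed scheme, a heavily weighted criterion with a low grade pulls the overall percentage down more than a lightly weighted one, which is one way a framework with several missing or under-specified aspects could end up with a low aggregate score.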
