Risk Reporting for Developers' Internal AI Model Use
Oscar Delaney, Sambhav Maheshwari, Joe O'Brien, Theo Bearman, Oliver Guest

TL;DR
This paper proposes a standardized framework for risk reporting on internal AI model use, addressing safety, legal compliance, and risk management for frontier AI companies before public deployment.
Contribution
It introduces a harmonized risk reporting standard tailored for internal AI use, aligning with multiple regulatory frameworks and focusing on autonomous and insider threat vectors.
Findings
Provides a structured risk report template for internal AI deployment.
Aligns risk reporting with legal frameworks like SB 53, RAISE Act, and EU guidelines.
Emphasizes the importance of regular, detailed risk assessments before deploying powerful models.
Abstract
Frontier AI companies first deploy their most advanced models internally, for weeks or months of safety testing, evaluation, and iteration, before a possible public release. For example, Anthropic recently developed a new class of model with advanced cyberoffense-relevant capabilities, Mythos Preview, which was available internally for at least six weeks before it was publicly announced. This internal use creates risks that external deployment frameworks may fail to address. Legal frameworks, notably California's Transparency in Frontier Artificial Intelligence Act (SB 53), New York's Responsible AI Safety And Education (RAISE) Act, and the EU's General-Purpose AI Code of Practice, all discuss risks from internal AI use. They require frontier developers to make and implement plans for how to manage risks from internal use, and to produce internal use risk reports describing their…
