What Should Frontier AI Developers Disclose About Internal Deployments?
Jacob Charnock, Raja Mehta Moreno, Justin Miller, William L. Anderson

TL;DR
This paper proposes a framework for frontier AI developers to disclose key information about internal model deployments to enhance safety, transparency, and governance.
Contribution
It identifies specific disclosure categories and analyzes their benefits, limitations, and risk mitigation strategies, filling a guidance gap.
Findings
The framework covers four disclosure categories: capabilities, usage, safety mitigations, and governance.
Disclosures can improve transparency and safety oversight.
The guidance applies to both public and private reporting.
Abstract
Frontier AI developers are increasingly deploying highly capable models internally to automate AI R&D, but these deployments currently face limited external oversight. It is essential, therefore, that developers provide evidence that internally deployed models are safe. While recent work has highlighted the risks of internal deployments and proposed broad approaches to transparency and governance, there remains little guidance on the specific information developers should disclose about them. We address this gap by identifying key information that companies should disclose about internally deployed models across four categories: capabilities, usage, safety mitigations, and governance. For each category, we analyse the key benefits and limitations of disclosure and consider how disclosure-related risks can be mitigated. Our framework could be used by developers to inform both public…
