NeurIPS Should Require Reproducibility Standards for Frontier AI Safety Claims

Varad Vishwarupe; Nigel Shadbolt; Marina Jirotka; Ivan Flechais

arXiv:2605.08192 · cs.CY · May 12, 2026

TL;DR

This position paper argues that NeurIPS should require reproducibility standards for papers making frontier AI safety claims, emphasizing transparency and evaluation integrity in high-stakes model deployment decisions.

Contribution

The paper proposes a three-tier disclosure framework and a mandatory claim inventory to improve the transparency and reproducibility of AI safety assertions.

Findings

01. Current safety claims are often non-reproducible, undermining trust.

02. Existing transparency scores are low, with inadequate disclosure of training data.

03. A structured disclosure framework can enhance evaluation and trustworthiness.

Abstract

Frontier AI safety claims - published assertions that a highly capable general-purpose model is below a threshold of concern, adequately mitigated, or suitable for release - increasingly shape model deployment, governance, and public trust. Yet the artefacts needed to evaluate them are routinely withheld, producing an evidential inversion: the most consequential claims in AI safety are often the least reproducible. This position paper argues that NeurIPS should require reproducibility standards for papers making such claims, treating non-reproducibility not as a transparency preference but as an evaluation-methodology failure. The 2026 International AI Safety Report [Bengio et al., 2026] concludes that reliable pre-deployment safety testing has become harder to conduct and that models now distinguish test from deployment contexts; the 2025 Foundation Model Transparency Index [Wan et…
