Evaluating the Critical Risks of Amazon's Nova Premier under the Frontier Model Safety Framework

Satyapriya Krishna; Ninareh Mehrabi; Abhinav Mohanty; Matteo Memelli; Vincent Ponzo; Payal Motwani; Rahul Gupta

arXiv:2507.06260 · cs.CR · July 10, 2025

Open Access

TL;DR

This paper evaluates Amazon's Nova Premier model in three high-risk domains (CBRN, Offensive Cyber Operations, and Automated AI R&D) and concludes, on the basis of automated benchmarks, expert red-teaming, and uplift studies, that the model is safe for public release.

Contribution

It is the first comprehensive evaluation of Nova Premier's critical risk profile under the Frontier Model Safety Framework.

Findings

01. Nova Premier is deemed safe for public release.
02. The evaluation methodology combines automated benchmarks, expert red-teaming, and uplift studies.
03. Ongoing safety improvements are planned as new risks emerge.

Abstract

Nova Premier is Amazon's most capable multimodal foundation model and teacher for model distillation. It processes text, images, and video with a one-million-token context window, enabling analysis of large codebases, 400-page documents, and 90-minute videos in a single prompt. We present the first comprehensive evaluation of Nova Premier's critical risk profile under the Frontier Model Safety Framework. Evaluations target three high-risk domains -- Chemical, Biological, Radiological & Nuclear (CBRN), Offensive Cyber Operations, and Automated AI R&D -- and combine automated benchmarks, expert red-teaming, and uplift studies to determine whether the model exceeds release thresholds. We summarize our methodology and report core findings. Based on this evaluation, we find that Nova Premier is safe for public release as per our commitments made at the 2025 Paris AI Safety Summit. We will…

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Safety Systems Engineering in Autonomy · Risk and Safety Analysis