Amazon Nova AI Challenge -- Trusted AI: Advancing secure, AI-assisted software development

Sattvik Sahai; Prasoon Goyal; Michael Johnston; Anna Gottardi; Yao Lu; Lucy Hu; Luke Dai; Shaohua Liu; Samyuth Sagi; Hangjie Shi; Desheng Zhang; Lavina Vaz; Leslie Ball; Maureen Murray; Rahul Gupta; and Shankar Ananthakrishna

arXiv:2508.10108·cs.AI·August 15, 2025

TL;DR

This paper discusses the Amazon Nova AI Challenge focused on advancing secure AI in software development, highlighting novel safety alignment techniques, adversarial testing, and collaborative efforts to improve AI safety standards.

Contribution

It introduces new methods for safety alignment, robust guardrails, and adversarial testing in AI-assisted software development, supported by a comprehensive challenge framework.

Findings

01 Development of state-of-the-art safety techniques

02 Creation of a custom baseline AI model

03 Successful adversarial safety evaluations

Abstract

AI systems for software development are rapidly gaining prominence, yet significant challenges remain in ensuring their safety. To address this, Amazon launched the Trusted AI track of the Amazon Nova AI Challenge, a global competition among 10 university teams to drive advances in secure AI. In the challenge, five teams focus on developing automated red-teaming bots, while the other five create safe AI assistants. The challenge gives teams a unique platform for evaluating automated red-teaming and safety alignment methods through head-to-head adversarial tournaments, in which red teams hold multi-turn conversations with the competing AI coding assistants to test their safety alignment. The challenge also provides teams with a feed of high-quality annotated data to fuel iterative improvement. Throughout the challenge, teams developed state-of-the-art techniques,…
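The head-to-head structure described in the abstract can be pictured with a minimal sketch. The abstract gives only the team counts (five red teams, five assistant teams); the team names below are hypothetical placeholders, and the assumption that every red team faces every assistant is ours, not something the abstract states.

```python
from itertools import product

# Hypothetical placeholder names; the abstract only gives the counts (five and five).
RED_TEAMS = [f"red_team_{i}" for i in range(1, 6)]
ASSISTANTS = [f"assistant_{i}" for i in range(1, 6)]


def build_matchups(attackers, defenders):
    """Pair every attacker with every defender.

    Assumption: a full cross schedule. The source does not specify
    how pairings are actually drawn in the tournaments.
    """
    return list(product(attackers, defenders))


matchups = build_matchups(RED_TEAMS, ASSISTANTS)
```

Under this assumed schedule, each pairing would correspond to one or more multi-turn conversations between a red-teaming bot and a coding assistant.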
