Security practices in AI development

Petr Spelda; Vit Stritecky

arXiv:2507.21061·cs.CR·July 30, 2025

TL;DR

This paper argues that security practices built on tools such as alignment and red teaming, rather than the capabilities of those tools, are what made safety claims about general purpose AI systems acceptable. It identifies shortcomings in these practices and proposes improvements.

Contribution

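The paper shows how security practices shape AI safety claims, explains what causes the gap between the capabilities of security tools and the desired safety guarantees, and identifies shortcomings in how current practices try to fill that gap, with suggestions for improvement.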

Findings

01

Security practices, rather than the capabilities of security tools such as alignment and red teaming, made AI safety claims acceptable.

02

Current practices have shortcomings in diversity and participation.

03

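Security practices are part of securitization processes that support (commercial) development of systems whose trustworthiness can only be imperfectly tested, not guaranteed.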

Abstract

What makes safety claims about general purpose AI systems such as large language models trustworthy? We show that rather than the capabilities of security tools such as alignment and red teaming procedures, it is security practices based on these tools that contributed to reconfiguring the image of AI safety and made the claims acceptable. After showing what causes the gap between the capabilities of security tools and the desired safety guarantees, we critically investigate how AI security practices attempt to fill the gap and identify several shortcomings in diversity and participation. We found that these security practices are part of securitization processes aiming to support (commercial) development of general purpose AI systems whose trustworthiness can only be imperfectly tested instead of guaranteed. We conclude by offering several improvements to the current AI security…
