Security practices in AI development
Petr Spelda, Vit Stritecky

TL;DR
This paper critically examines how security practices, rather than tools alone, shape perceptions of AI safety, highlighting shortcomings and proposing improvements to better ensure trustworthy AI development.
Contribution
It reveals the role of security practices in shaping AI safety claims and identifies gaps and shortcomings in current approaches, offering suggestions for enhancement.
Findings
Security practices, more than the capabilities of the security tools themselves, shape perceptions of AI safety.
Current practices have shortcomings in diversity and participation.
Security practices support (commercial) development of systems whose trustworthiness can only be imperfectly tested, not guaranteed.
Abstract
What makes safety claims about general purpose AI systems such as large language models trustworthy? We show that rather than the capabilities of security tools such as alignment and red teaming procedures, it is security practices based on these tools that contributed to reconfiguring the image of AI safety and made the claims acceptable. After showing what causes the gap between the capabilities of security tools and the desired safety guarantees, we critically investigate how AI security practices attempt to fill the gap and identify several shortcomings in diversity and participation. We found that these security practices are part of securitization processes aiming to support (commercial) development of general purpose AI systems whose trustworthiness can only be imperfectly tested instead of guaranteed. We conclude by offering several improvements to the current AI security…
