Dynamic Risk Assessments for Offensive Cybersecurity Agents
Boyi Wei, Benedikt Stroebl, Jiacen Xu, Joie Zhang, Zhou Li, Peter Henderson

TL;DR
This paper argues for dynamic risk assessments of offensive cybersecurity agents, showing that adversaries can significantly improve agent capabilities even within a limited compute budget, a possibility that current audits often overlook.
Contribution
The paper introduces an expanded threat model that accounts for the degrees of freedom available to adversaries in stateful and non-stateful environments, and demonstrates that iterative optimization can yield substantial capability improvements.
Findings
Adversaries can improve an agent's cybersecurity capabilities by over 40% within a compute budget of 8 H100 GPU hours.
Current audits often underestimate risks by not accounting for iterative adversarial improvements.
Dynamic risk evaluation provides a more accurate picture of potential threats.
Abstract
Foundation models are increasingly becoming better autonomous programmers, raising the prospect that they could also automate dangerous offensive cyber-operations. Current frontier model audits probe the cybersecurity risks of such agents, but most fail to account for the degrees of freedom available to adversaries in the real world. In particular, with strong verifiers and financial incentives, agents for offensive cybersecurity are amenable to iterative improvement by would-be adversaries. We argue that assessments should take into account an expanded threat model in the context of cybersecurity, emphasizing the varying degrees of freedom that an adversary may possess in stateful and non-stateful environments within a fixed compute budget. We show that even with a relatively small compute budget (8 H100 GPU Hours in our study), adversaries can improve an agent's cybersecurity…
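The role of a strong verifier under a fixed budget can be illustrated with a toy sketch. This is not the paper's method or code: the function names, the toy flag, and the choice to count the budget in attempts rather than GPU hours are all assumptions made for illustration. The sketch only shows why a single-shot evaluation can understate what an adversary who may retry against a verifier can achieve.

```python
from typing import Callable, Optional, Tuple


def attempt_until_verified(
    propose: Callable[[int], str],
    verify: Callable[[str], bool],
    budget: int,
) -> Tuple[Optional[str], int]:
    """Try up to `budget` candidates and stop at the first one the verifier accepts.

    Returns the accepted candidate (or None) and the number of attempts used.
    """
    for attempt in range(budget):
        candidate = propose(attempt)
        if verify(candidate):
            return candidate, attempt + 1
    return None, budget


# Hypothetical stand-ins: a real agent would generate exploit attempts and
# the verifier would be, for example, a check that a captured flag is correct.
SECRET = "flag{3}"


def toy_propose(i: int) -> str:
    return f"flag{{{i}}}"


def toy_verify(candidate: str) -> bool:
    return candidate == SECRET


# A static, single-attempt evaluation versus an adversary with a larger budget.
single_try = attempt_until_verified(toy_propose, toy_verify, budget=1)
many_tries = attempt_until_verified(toy_propose, toy_verify, budget=8)
print(single_try)  # (None, 1)
print(many_tries)  # ('flag{3}', 4)
```

In this toy setting the same proposer fails with a budget of one attempt and succeeds with a budget of eight, which mirrors the gap between a one-off audit and an adversary who iterates.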
Taxonomy
Topics: Information and Cyber Security · Network Security and Intrusion Detection · Smart Grid Security and Resilience
