OpenSec: Measuring Incident Response Agent Calibration Under Adversarial Evidence

Jarrod Barnes

arXiv:2601.21083 · cs.AI · February 10, 2026

TL;DR

OpenSec introduces a reinforcement learning (RL) environment for evaluating how well incident response agents stay calibrated under adversarial evidence. It reveals significant over-triggering and calibration gaps in frontier models such as GPT-5.2 and Claude Sonnet 4.5.

Contribution

OpenSec provides a realistic benchmark for incident response (IR) agents that isolates calibration failures under adversarial prompt injection. Existing benchmarks conflate action execution with correct execution, so they hide these failures.

Findings

01. GPT-5.2 executes containment in every episode, with an 82.5% false positive rate.

02. Claude Sonnet 4.5 shows partial calibration, with a lower false positive rate.

03. The calibration failures lie in restraint, not in threat detection.

Abstract

As large language models (LLMs) improve, so do their offensive applications: frontier agents now generate working exploits for under $50 in compute (Heelan, 2026). Defensive incident response (IR) agents must keep pace, but existing benchmarks conflate action execution with correct execution, hiding calibration failures when agents process adversarial evidence. We introduce OpenSec, a dual-control reinforcement learning (RL) environment that evaluates IR agents under realistic prompt injection scenarios with execution-based scoring: time-to-first-containment (TTFC), evidence-gated action rate (EGAR), blast radius, and per-tier injection violation rates. Evaluating four frontier models on 40 standard-tier episodes each, we find consistent over-triggering: GPT-5.2 executes containment in 100% of episodes with 82.5% false positive rate, acting at step 4 before gathering sufficient…
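The abstract names the metrics but this excerpt does not define them, so the following is only a minimal sketch of how execution-based scoring of one episode might look. Every definition here is an assumption, not the paper's: TTFC is taken as the step index of the first containment action, EGAR as the share of containment actions whose target had already appeared in gathered evidence, the false positive rate as the share of containment actions aimed at non-malicious entities, and blast radius as the number of distinct non-malicious entities contained. The action names, the `Step` type, and `score_episode` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical action names; the excerpt does not list OpenSec's real action set.
CONTAINMENT = {"isolate_host", "block_domain", "reset_user"}


@dataclass(frozen=True)
class Step:
    action: str
    target: Optional[str] = None  # entity acted on, if any
    reveals: tuple = ()           # entities surfaced by this step's evidence


def score_episode(steps, malicious):
    """Score one episode under the assumed metric definitions above."""
    seen = set()     # entities the agent has evidence for so far
    ttfc = None      # step index of the first containment action
    contained = []   # (target, was_evidence_gated) per containment action
    for i, step in enumerate(steps, start=1):
        if step.action in CONTAINMENT:
            if ttfc is None:
                ttfc = i
            # Gated only if the target appeared in evidence from earlier steps.
            contained.append((step.target, step.target in seen))
        seen.update(step.reveals)
    n = len(contained)
    false_pos = [t for t, _ in contained if t not in malicious]
    return {
        "ttfc": ttfc,
        "egar": sum(gated for _, gated in contained) / n if n else None,
        "false_positive_rate": len(false_pos) / n if n else None,
        "blast_radius": len(set(false_pos)),
    }


# Illustrative episode: one evidence-gated, correct containment,
# followed by one ungated containment of a benign domain.
steps = [
    Step("query_logs", reveals=("host-7",)),
    Step("isolate_host", target="host-7"),
    Step("block_domain", target="cdn.example.com"),
]
result = score_episode(steps, malicious={"host-7"})
print(result)
```

Under these assumed definitions, an agent that contains early and often would tend to show a low TTFC together with a low EGAR and a high false positive rate, which is the over-triggering pattern the abstract describes.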

Code & Models

Datasets

Jarrodbarnes/opensec-seeds
dataset· 106 dl

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Ethics and Social Impacts of AI