Cross-Session Threats in AI Agents: Benchmark, Evaluation, and Algorithms

Ari Azarafrooz

arXiv:2604.21131·cs.CR·April 24, 2026

TL;DR

This paper introduces CSTM-Bench, a dataset and evaluation framework for cross-session threat detection in AI agents, and proposes algorithms that make detection more robust when an attack is spread across sessions.

Contribution

The paper contributes a new dataset, a measurement methodology, and algorithms for cross-session threat detection, addressing the limitations of memoryless guardrails in AI agents.

Findings

01. Detection recall drops by half when moving from isolated to cross-session scenarios.

02. Bounded-memory coreset reader maintains higher recall across shards.

03. Introducing CSR_prefix as a stability metric improves detection benchmarking.

Abstract

AI-agent guardrails are memoryless: each message is judged in isolation, so an adversary who spreads a single attack across dozens of sessions slips past every session-bound detector because only the aggregate carries the payload. We make three contributions to cross-session threat detection. (1) Dataset. CSTM-Bench is 26 executable attack taxonomies classified by kill-chain stage and cross-session operation (accumulate, compose, launder, inject_on_reader), each bound to one of seven identity anchors that ground-truth "violation" as a policy predicate, plus matched Benign-pristine and Benign-hard confounders. Released on Hugging Face as intrinsec-ai/cstm-bench with two 54-scenario splits: dilution (compositional) and cross_session (12 isolation-invisible scenarios produced by a closed-loop rewriter that softens surface phrasing while preserving cross-session artefacts). (2)…
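
The memoryless failure mode described in the abstract can be illustrated with a toy sketch. This is not the paper's algorithm or the CSTM-Bench schema: the `Event` fields, the action names, and the three-part policy predicate are all hypothetical, chosen only to show why a detector that judges each session in isolation misses a payload that exists only in the aggregate, while a reader that accumulates evidence per identity anchor catches it.

```python
from dataclasses import dataclass

# Hypothetical policy predicate: a violation occurs when a single identity
# assembles all three of these actions, in any order, across any sessions.
FORBIDDEN_PARTS = frozenset({"read_credentials", "encode_payload", "send_external"})


@dataclass(frozen=True)
class Event:
    identity: str  # identity anchor (hypothetical field name)
    session: str   # session identifier
    action: str    # simplified stand-in for a message or tool call


def memoryless_flags(events):
    """Session-bound check: flag a session only if it alone holds every part."""
    by_session = {}
    for e in events:
        by_session.setdefault(e.session, set()).add(e.action)
    return {s for s, actions in by_session.items() if FORBIDDEN_PARTS <= actions}


def cross_session_flags(events):
    """Cross-session check: accumulate actions per identity over all sessions."""
    by_identity = {}
    for e in events:
        by_identity.setdefault(e.identity, set()).add(e.action)
    return {i for i, actions in by_identity.items() if FORBIDDEN_PARTS <= actions}


# One identity spreads the attack over three sessions; another is benign.
events = [
    Event("svc-account-7", "s1", "read_credentials"),
    Event("svc-account-7", "s2", "encode_payload"),
    Event("svc-account-7", "s3", "send_external"),
    Event("svc-account-9", "s4", "read_credentials"),
]

print(memoryless_flags(events))     # set()
print(cross_session_flags(events))  # {'svc-account-7'}
```

No single session satisfies the predicate, so the session-bound check returns nothing, while the per-identity aggregate flags `svc-account-7`. The sketch keeps unbounded state per identity; it says nothing about how a bounded-memory reader would choose what to retain.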

Code & Models

Repositories

https://huggingface.co/intrinsec-ai/cstm-bench