DeepContext: Stateful Real-Time Detection of Multi-Turn Adversarial Intent Drift in LLMs
Justin Albrethsen, Yash Datta, Kunal Kumar, Sharath Rajasekar

TL;DR
DeepContext is a stateful, RNN-based framework for real-time detection of multi-turn adversarial intent drift in LLMs. It tracks how user intent evolves across a conversation instead of scoring each turn in isolation.
Contribution
The paper presents DeepContext, a novel stateful monitoring system that models conversation sequences to detect adversarial intent drift, outperforming existing stateless approaches.
Findings
Achieves a state-of-the-art F1 score of 0.84 in multi-turn jailbreak detection.
Outperforms both hyperscaler guardrails and open-weight models such as Llama-Prompt-Guard-2 and Granite-Guardian.
Maintains sub-20ms inference latency on a T4 GPU, suitable for real-time deployment.
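For readers unfamiliar with the metric in the first finding, F1 is the harmonic mean of precision and recall. The sketch below shows the standard definition only. The counts are made up for illustration and are not results from the paper, and this summary does not say whether the paper scores per turn or per conversation.

```python
def f1_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Standard F1: harmonic mean of precision and recall."""
    if true_pos == 0:
        # No correct detections means precision and recall are both zero.
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, NOT taken from the paper:
# 8 jailbreak conversations flagged correctly, 2 benign ones flagged
# by mistake, 2 jailbreaks missed.
example_f1 = f1_score(true_pos=8, false_pos=2, false_neg=2)
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a detector cannot reach a high F1 by flagging everything (high recall, low precision) or by flagging almost nothing (high precision, low recall).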
Abstract
While Large Language Model (LLM) capabilities have scaled, safety guardrails remain largely stateless, treating multi-turn dialogues as a series of disconnected events. This lack of temporal awareness facilitates a "Safety Gap" where adversarial tactics, like Crescendo and ActorAttack, slowly bleed malicious intent across turn boundaries to bypass stateless filters. We introduce DeepContext, a stateful monitoring framework designed to map the temporal trajectory of user intent. DeepContext discards the isolated evaluation model in favor of a Recurrent Neural Network (RNN) architecture that ingests a sequence of fine-tuned turn-level embeddings. By propagating a hidden state across the conversation, DeepContext captures the incremental accumulation of risk that stateless models overlook. Our evaluation demonstrates that DeepContext significantly outperforms existing baselines in…
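The core idea in the abstract, a hidden state carried across turns so that risk can accumulate, can be illustrated with a toy recurrent cell. This is a minimal sketch under stated assumptions, not the authors' implementation: the class name `StatefulRiskMonitor`, the Elman-style tanh cell, the hand-picked weights, and the one-dimensional "embeddings" are all hypothetical stand-ins. The summary does not specify the paper's cell type, dimensions, or encoder.

```python
import math
from typing import List, Sequence


class StatefulRiskMonitor:
    """Toy Elman-style recurrent monitor. All weights are hypothetical."""

    def __init__(
        self,
        w_in: List[List[float]],   # hidden_dim x embed_dim
        w_rec: List[List[float]],  # hidden_dim x hidden_dim
        w_out: List[float],        # hidden_dim
        bias_out: float,
    ) -> None:
        self.w_in = w_in
        self.w_rec = w_rec
        self.w_out = w_out
        self.bias_out = bias_out
        self.hidden = [0.0] * len(w_rec)

    def reset(self) -> None:
        """Forget the conversation so far (start of a new dialogue)."""
        self.hidden = [0.0] * len(self.w_rec)

    def step(self, embedding: Sequence[float]) -> float:
        """Consume one turn embedding and return a risk score in (0, 1)."""
        new_hidden = []
        for i in range(len(self.hidden)):
            pre = sum(w * x for w, x in zip(self.w_in[i], embedding))
            pre += sum(w * h for w, h in zip(self.w_rec[i], self.hidden))
            new_hidden.append(math.tanh(pre))
        self.hidden = new_hidden
        logit = sum(w * h for w, h in zip(self.w_out, self.hidden))
        logit += self.bias_out
        return 1.0 / (1.0 + math.exp(-logit))


def make_monitor() -> StatefulRiskMonitor:
    # Hand-picked illustrative weights, not learned parameters.
    return StatefulRiskMonitor(
        w_in=[[0.5]], w_rec=[[1.0]], w_out=[4.0], bias_out=-2.0
    )


# Four turns that each carry the same mild signal (a stand-in for
# fine-tuned turn-level embeddings, which would be high-dimensional).
turns = [[0.3], [0.3], [0.3], [0.3]]

# Stateful: the hidden state is propagated across turns.
monitor = make_monitor()
scores = [monitor.step(t) for t in turns]

# Stateless baseline: the state is reset before every turn.
stateless_scores = []
for t in turns:
    monitor.reset()
    stateless_scores.append(monitor.step(t))
```

With these weights, `scores` rises turn by turn because each step adds to the carried hidden state, while `stateless_scores` stays flat because every turn is judged from a blank state. That contrast is the gap the abstract describes: a sequence of individually mild turns can look harmless to a stateless filter yet drift toward a high cumulative risk.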
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Topic Modeling · Explainable Artificial Intelligence (XAI)
