Surviving the Unseen: Predictive Defense for Novel Multi-Turn Multimodal Attacks

Doohee You

arXiv:2605.18988·cs.CR·May 20, 2026

TL;DR

This paper introduces TRIAD, a predictive framework for detecting and mitigating novel multimodal, multi-turn adversarial attacks on large language models by modeling conversational trajectories and structural anomalies.

Contribution

The paper proposes a novel dynamic safety verification framework, TRIAD, combining trajectory analysis, anomaly detection, and hazard modeling for real-time defense against unseen multimodal attacks.

Findings

01

TRIAD yields a mathematical bound on the expected time-to-failure under attack.

02

The framework effectively detects structural anomalies and malicious drift in multimodal conversations.

03

TRIAD offers a computationally efficient and interpretable safety safeguard for AI systems.

Abstract

The expansion of Multimodal Large Language Models (MLLMs) and their integration into autonomous agentic workflows have introduced a non-stationary attack surface. Empirical observations indicate that adversaries employ progressive, cross-modal perturbations that evade turn-specific guardrails by distributing malicious intent across longitudinal conversational trajectories. Static defense mechanisms, constrained by the Markov property, evaluate inputs in isolation and fail to detect cumulative structural poisoning. To address this limitation, this paper formulates safety verification as a dynamic survival prediction and trajectory dynamics problem. The Triple-tier Anomaly Defense (TRIAD) framework is proposed as a predictive model that represents multimodal, multi-turn conversational flow as a continuous trajectory. The framework integrates structural anomaly detection to monitor covariance…
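To make the survival framing concrete, the following is a minimal sketch of a generic discrete-time survival calculation over conversation turns. It is not the paper's method: the per-turn hazard values, the function names (`survival_curve`, `first_alert_turn`, `expected_turns_lower_bound`), and the alert threshold are all assumptions introduced for illustration, and this summary does not state the form or direction of TRIAD's own bound. The sketch only shows how per-turn risk can accumulate across a trajectory even when no single turn looks alarming.

```python
from typing import Optional, Sequence


def survival_curve(hazards: Sequence[float]) -> list[float]:
    """Return S[t], the probability that no failure has occurred through turn t+1.

    `hazards` holds hypothetical per-turn failure probabilities; in a real
    system they would come from some risk model, which is not specified here.
    """
    curve = []
    alive = 1.0
    for h in hazards:
        if not 0.0 <= h <= 1.0:
            raise ValueError("each hazard must lie in [0, 1]")
        alive *= 1.0 - h  # survive this turn given survival so far
        curve.append(alive)
    return curve


def first_alert_turn(hazards: Sequence[float], min_survival: float) -> Optional[int]:
    """1-based turn at which cumulative survival first drops below the threshold."""
    for turn, s in enumerate(survival_curve(hazards), start=1):
        if s < min_survival:
            return turn
    return None


def expected_turns_lower_bound(h_max: float) -> float:
    """Lower bound on the expected number of turns until failure.

    If every per-turn hazard is at most h_max, then P(T > t) >= (1 - h_max)**t,
    and summing over t >= 0 gives E[T] >= 1 / h_max.
    """
    if not 0.0 < h_max <= 1.0:
        raise ValueError("h_max must lie in (0, 1]")
    return 1.0 / h_max


if __name__ == "__main__":
    hazards = [0.1, 0.1, 0.5]  # illustrative values only
    print(survival_curve(hazards))
    print(first_alert_turn(hazards, min_survival=0.5))
    print(expected_turns_lower_bound(0.25))
```

A turn-by-turn filter with a fixed per-turn cutoff would pass the first two turns of the illustrative sequence without comment, whereas the cumulative survival curve records that risk has already been accruing; that contrast is the motivation the abstract gives for moving away from evaluating inputs in isolation.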
