Blind Spots in the Guard: How Domain-Camouflaged Injection Attacks Evade Detection in Multi-Agent LLM Systems

Aaditya Pai

arXiv:2605.22001 · cs.CR · May 22, 2026

TL;DR

Injection detectors for multi-agent LLM systems largely miss payloads that are camouflaged in the domain vocabulary and authority structures of the target document. The paper formalizes this blind spot as the Camouflage Detection Gap and provides a framework for measuring it.

Contribution

The paper formalizes the Camouflage Detection Gap (CDG), shows that it is large and statistically significant across 45 tasks, three domains, and two model families, and releases tools to evaluate and address the vulnerability.

Findings

01

On Llama 3.1 8B, detection rates drop from 93.8% for static payloads to 9.7% for camouflaged payloads.

02

Camouflaged payloads significantly increase attack success in multi-agent debate architectures.

03

Targeted detector augmentation offers only limited remediation, indicating that the vulnerability is architectural.

Abstract

Injection detectors deployed to protect LLM agents are calibrated on static, template-based payloads that announce themselves as override directives. We identify a systematic blind spot: when payloads are generated to mimic the domain vocabulary and authority structures of the target document, which we call domain-camouflaged injection, standard detectors fail to flag them, with detection rates dropping from 93.8% to 9.7% on Llama 3.1 8B and from 100% to 55.6% on Gemini 2.0 Flash. We formalize this as the Camouflage Detection Gap (CDG), the difference in injection detection rate between static and camouflaged payloads. Across 45 tasks spanning three domains and two model families, CDG is large and statistically significant (chi^2 = 38.03, p < 0.001 for Llama; chi^2 = 17.05, p < 0.001 for Gemini), with zero reverse discordant pairs in either case. We additionally evaluate Llama Guard 3, a…
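
Illustrative sketch (not from the paper)

The abstract defines CDG as the static detection rate minus the camouflaged detection rate, which gives 93.8 - 9.7 = 84.1 percentage points for Llama 3.1 8B and 100 - 55.6 = 44.4 points for Gemini 2.0 Flash. The reference to "reverse discordant pairs" suggests a paired test such as McNemar's, but the abstract does not name the test, so that is an assumption here. The minimal Python sketch below computes CDG and a McNemar statistic from paired per-payload detector outcomes. All function names are hypothetical and are not taken from the paper's released tools.

```python
from typing import Sequence, Tuple


def detection_rate(flags: Sequence[bool]) -> float:
    """Fraction of payloads the detector flagged as injections."""
    if not flags:
        raise ValueError("need at least one outcome")
    return sum(flags) / len(flags)


def camouflage_detection_gap(static_flags: Sequence[bool],
                             camo_flags: Sequence[bool]) -> float:
    """CDG as defined in the abstract: static rate minus camouflaged rate."""
    return detection_rate(static_flags) - detection_rate(camo_flags)


def discordant_counts(static_flags: Sequence[bool],
                      camo_flags: Sequence[bool]) -> Tuple[int, int]:
    """Count paired disagreements between the two conditions.

    b: static payload detected, camouflaged counterpart missed.
    c: static payload missed, camouflaged counterpart detected
       (the "reverse" discordant pairs).
    """
    if len(static_flags) != len(camo_flags):
        raise ValueError("outcomes must be paired one-to-one")
    b = sum(1 for s, k in zip(static_flags, camo_flags) if s and not k)
    c = sum(1 for s, k in zip(static_flags, camo_flags) if k and not s)
    return b, c


def mcnemar_chi2(b: int, c: int, correction: bool = True) -> float:
    """McNemar chi-square statistic (1 degree of freedom).

    ASSUMPTION: the source does not state which test or correction it used.
    With correction=True this applies the usual continuity correction,
    (|b - c| - 1)^2 / (b + c).
    """
    n = b + c
    if n == 0:
        return 0.0  # no discordant pairs, so no evidence of a difference
    diff = abs(b - c)
    if correction:
        diff = max(diff - 1, 0)
    return diff ** 2 / n
```

With zero reverse discordant pairs (c = 0), the continuity-corrected statistic reduces to (b - 1)^2 / b. As a consistency check only, b = 40 gives 38.025 and b = 19 gives about 17.05, which is close to the reported values. These counts are inferred from the arithmetic and are not stated in the abstract.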
