Semantic Intent Fragmentation: A Single-Shot Compositional Attack on Multi-Agent AI Pipelines
Tanzim Ahad, Ismail Hossain, Md Jahangir Alam, Sai Puppala, Yoonpyo Lee, Syed Bahauddin Alam, Sajedul Talukder

TL;DR
This paper presents Semantic Intent Fragmentation (SIF), a novel attack on multi-agent AI systems in which a single, legitimately phrased request leads the orchestrator to decompose a task into subtasks that each appear benign but jointly violate security policy.
Contribution
The paper introduces SIF, a new attack method against LLM orchestration systems, demonstrating its effectiveness and proposing detection strategies to close the safety gap.
Findings
SIF successfully causes policy violations in 71% of tested scenarios.
Multiple signals can detect SIF attacks with 0% false positives.
Stronger orchestrators increase the success rate of SIF attacks.
Abstract
We introduce Semantic Intent Fragmentation (SIF), an attack class against LLM orchestration systems where a single, legitimately phrased request causes an orchestrator to decompose a task into subtasks that are individually benign but jointly violate security policy. Current safety mechanisms operate at the subtask level, so each step clears existing classifiers -- the violation only emerges at the composed plan. SIF exploits OWASP LLM06:2025 through four mechanisms: bulk scope escalation, silent data exfiltration, embedded trigger deployment, and quasi-identifier aggregation, requiring no injected content, no system modification, and no attacker interaction after the initial request. We construct a three-stage red-teaming pipeline grounded in OWASP, MITRE ATLAS, and NIST frameworks to generate realistic enterprise scenarios. Across 14 scenarios spanning financial reporting, information…
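The gap the abstract describes, where every subtask clears a subtask-level check while the composed plan violates policy, can be illustrated with a toy sketch. This is not the paper's pipeline or detector. The `Subtask` type, the capability tags, and the `FORBIDDEN_COMBINATIONS` policy are all hypothetical names invented for illustration, and reducing a semantic policy to tag sets is a deliberate simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subtask:
    """One step of an orchestrator's plan (hypothetical structure)."""
    description: str
    tags: frozenset  # capability tags assumed to be assigned upstream


# Hypothetical policy: tag combinations that must not co-occur in one plan.
FORBIDDEN_COMBINATIONS = [
    frozenset({"read_pii", "send_external"}),
]


def violates(tags):
    """True if the given tag set contains any forbidden combination."""
    return any(combo <= tags for combo in FORBIDDEN_COMBINATIONS)


def subtask_level_check(plan):
    """Check each subtask in isolation, as a per-step classifier would."""
    return [violates(step.tags) for step in plan]


def plan_level_check(plan):
    """Check the union of tags across the whole composed plan."""
    combined = frozenset().union(*(step.tags for step in plan))
    return violates(combined)


# A made-up plan that a single, ordinary-sounding request might produce.
plan = [
    Subtask("Collect customer records for the quarterly report",
            frozenset({"read_pii"})),
    Subtask("Summarize the records into a spreadsheet",
            frozenset({"transform"})),
    Subtask("Email the spreadsheet to the listed vendor contact",
            frozenset({"send_external"})),
]

per_step = subtask_level_check(plan)
whole_plan = plan_level_check(plan)
print("per-subtask flags:", per_step)
print("plan-level flag:", whole_plan)
```

In this sketch no individual step is flagged, yet the plan-level check fires because the forbidden tags only co-occur once the steps are combined. A real system would need to reason about data flow between steps rather than a simple union of tags.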
