CASCADE: A Cascaded Hybrid Defense Architecture for Prompt Injection Detection in MCP-Based Systems
İpek Abasıkeleş Turgut, Edip Gümüş

TL;DR
CASCADE is a three-layer, fully local defense system for MCP-based LLM applications that detects prompt injection and tool poisoning, reporting 95.85% precision and a 6.06% false positive rate on a 5,000-sample dataset.
Contribution
The paper introduces a cascaded architecture for prompt injection detection in MCP systems that combines fast regex-based pre-filtering, embedding-based semantic analysis, and pattern-based output filtering.
Findings
Achieved 95.85% precision and 74.59% F1-score on 5,000 samples.
High detection rates for data exfiltration (91.5%) and prompt injection (84.2%).
Operates fully locally without external API dependencies.
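As a quick consistency check on these figures, F1 is the harmonic mean of precision and recall, so it can be recomputed from the 95.85% precision above and the 61.05% recall reported in the abstract. The sketch below does only that arithmetic; the helper name `f1_score` is ours and does not come from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given as fractions)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values as reported in the abstract, converted from percentages.
reported_precision = 0.9585
reported_recall = 0.6105

f1 = f1_score(reported_precision, reported_recall)
print(f"{f1:.2%}")  # prints "74.59%"
```

The recomputed value agrees with the reported F1-score, and it makes the trade-off visible: the high precision is paired with a noticeably lower recall, which pulls the F1-score down.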
Abstract
Model Context Protocol (MCP) is a rapidly adopted standard for defining and invoking external tools in LLM applications. The multi-layered architecture of MCP introduces new attack surfaces such as tool poisoning, in addition to traditional prompt injection. Existing defense systems suffer from limitations including high false positive rates, API dependency, or white-box access requirements. In this study, we propose CASCADE, a three-tiered cascaded defense architecture for MCP-based systems: (i) Layer 1 performs fast pre-filtering using regex, phrase weighting, and entropy analysis; (ii) Layer 2 conducts semantic analysis via BGE embedding with an Ollama Llama3 fallback mechanism; (iii) Layer 3 applies pattern-based output filtering. Evaluation on a dataset of 5,000 samples yielded 95.85% precision, 6.06% false positive rate, 61.05% recall, and 74.59% F1-score. Analysis across 31…
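To make the three-layer control flow described in the abstract concrete, here is a minimal sketch of a cascaded detector. It is not the authors' implementation. Every pattern, phrase weight, and threshold is an illustrative assumption, the escalation rule between layers is a guess at one reasonable design, and Layer 2 takes two hypothetical callables (`scorer` and `fallback`) as stand-ins for the BGE embedding model and the Ollama Llama3 fallback, so the example runs with the standard library alone.

```python
import math
import re
from collections import Counter
from typing import Callable, Optional

# All patterns, weights, and thresholds are illustrative assumptions,
# not values taken from the paper.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.I),
    re.compile(r"reveal (the )?system prompt", re.I),
]
PHRASE_WEIGHTS = {"system prompt": 0.4, "do not tell the user": 0.6, "send to": 0.3}
OUTPUT_PATTERNS = [re.compile(r"sk-[A-Za-z0-9]{20,}")]  # hypothetical secret shape


def shannon_entropy(text: str) -> float:
    """Bits per character; unusually high values can hint at encoded payloads."""
    if not text:
        return 0.0
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in Counter(text).values())


def layer1_prefilter(text: str) -> Optional[bool]:
    """Layer 1: True = block, False = allow, None = uncertain (escalate)."""
    if any(p.search(text) for p in INJECTION_PATTERNS):
        return True
    lowered = text.lower()
    score = sum(w for phrase, w in PHRASE_WEIGHTS.items() if phrase in lowered)
    if score >= 0.6 or shannon_entropy(text) > 5.0:
        return None
    return False


def layer2_semantic(
    text: str,
    scorer: Callable[[str], float],
    fallback: Callable[[str], bool],
    low: float = 0.3,
    high: float = 0.7,
) -> bool:
    """Layer 2: trust confident scores, defer to the fallback in between."""
    score = scorer(text)  # stand-in for an embedding-similarity score in [0, 1]
    if score >= high:
        return True
    if score <= low:
        return False
    return fallback(text)  # stand-in for a local LLM judgment


def layer3_output_filter(output: str) -> str:
    """Layer 3: redact output spans that match sensitive-looking patterns."""
    for pattern in OUTPUT_PATTERNS:
        output = pattern.sub("[REDACTED]", output)
    return output


def is_malicious(
    text: str, scorer: Callable[[str], float], fallback: Callable[[str], bool]
) -> bool:
    """Run Layer 1 first and call Layer 2 only when Layer 1 is uncertain."""
    verdict = layer1_prefilter(text)
    if verdict is not None:
        return verdict
    return layer2_semantic(text, scorer, fallback)
```

In this sketch the cheap checks run on every input and the more expensive semantic layer is consulted only for inputs the pre-filter cannot settle, which is the usual motivation for a cascade. A real deployment would replace `scorer` and `fallback` with calls to locally hosted models.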
