MCIP: Protecting MCP Safety via Model Contextual Integrity Protocol
Huihao Jing, Haoran Li, Wenbin Hu, Qi Hu, Heli Xu, Tianshu Chu, Peizhao Hu, Yangqiu Song

TL;DR
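This paper introduces the Model Contextual Integrity Protocol (MCIP), a refined version of the Model Context Protocol (MCP) that addresses MCP's missing safety mechanisms. It also contributes a taxonomy of unsafe behaviors in MCP scenarios, plus benchmark and training data for evaluating and improving how well LLMs identify safety risks in MCP interactions.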
Contribution
It proposes MCIP, a refined version of MCP that addresses the protocol's safety gaps, along with a fine-grained taxonomy of unsafe behaviors and benchmark data for evaluating LLM safety in MCP interactions.
Findings
LLMs show vulnerabilities in MCP safety scenarios.
The proposed approach significantly improves LLM safety performance.
Benchmark data supports safety evaluation and training.
Abstract
As Model Context Protocol (MCP) introduces an easy-to-use ecosystem for users and developers, it also brings underexplored safety risks. Its decentralized architecture, which separates clients and servers, poses unique challenges for systematic safety analysis. This paper proposes a novel framework to enhance MCP safety. Guided by the MAESTRO framework, we first analyze the missing safety mechanisms in MCP, and based on this analysis, we propose the Model Contextual Integrity Protocol (MCIP), a refined version of MCP that addresses these gaps. Next, we develop a fine-grained taxonomy that captures a diverse range of unsafe behaviors observed in MCP scenarios. Building on this taxonomy, we develop benchmark and training data that support the evaluation and improvement of LLMs' capabilities in identifying safety risks within MCP interactions. Leveraging the proposed benchmark and training…
Taxonomy
Topics: Distributed systems and fault tolerance · Security and Verification in Computing · Software Testing and Debugging Techniques
