Governed MCP: Kernel-Level Tool Governance for AI Agents via Logit-Based Safety Primitives
Daeyeon Son

TL;DR
This paper introduces Governed MCP, a kernel-resident tool governance gateway for AI agents. It interposes on every MCP tool call with a 6-layer safety pipeline, instead of relying on userspace enforcement that a short script can bypass.
Contribution
It presents a kernel-resident gateway that enforces policy on every tool call an AI agent makes, integrates a logit-based safety primitive (ProbeLogits), and demonstrates its effectiveness and low overhead in Anima OS, a bare-metal x86_64 OS written in Rust.
Findings
Removing ProbeLogits reduces F1 from 0.773 to 0.327
The five non-inference layers add 65.3 microseconds of overhead per call
Complete mediation of WASM-to-system functions is achieved
Abstract
AI agents increasingly call external tools (file system, network, APIs) through the Model Context Protocol (MCP). These tool calls are the agent's syscalls -- privileged operations with side effects on shared state -- yet today's safety enforcement lives entirely in userspace, where a 10-line script can bypass it. I propose Governed MCP, a kernel-resident tool governance gateway built on a logit-based safety primitive (ProbeLogits, companion paper: arXiv:2604.11943). The gateway interposes on every MCP tool call in a 6-layer pipeline: schema validation, trust tier check, rate limit, adversarial pre-filter, ProbeLogits gate (the load-bearing semantic check), and constitutional policy match, with a Blake3-hashed audit chain. I implement Governed MCP in Anima OS, a bare-metal x86_64 OS in approximately 86,000 lines of Rust. The five non-inference layers add 65.3 microseconds of overhead…
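The layer ordering and hash-chained audit log described in the abstract can be illustrated with a minimal userspace sketch. Only the six layer names and their order come from the abstract. Everything else is an assumption made for illustration: the `ToolCall` fields, the individual checks and thresholds, the `unsafe_score` field standing in for whatever signal ProbeLogits derives from model logits, and the use of the standard library's `DefaultHasher` in place of Blake3. It is not the Anima OS implementation.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// The six layers, in the order the abstract lists them.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Layer {
    Schema,
    TrustTier,
    RateLimit,
    AdversarialPreFilter,
    ProbeLogitsGate,
    ConstitutionalPolicy,
}

pub const PIPELINE: [Layer; 6] = [
    Layer::Schema,
    Layer::TrustTier,
    Layer::RateLimit,
    Layer::AdversarialPreFilter,
    Layer::ProbeLogitsGate,
    Layer::ConstitutionalPolicy,
];

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Verdict {
    Allow,
    Deny(Layer),
}

/// Hypothetical tool-call record; every field is an assumption.
pub struct ToolCall {
    pub tool: String,
    pub args: String,
    pub trust_tier: u8,
    pub calls_this_window: u32,
    /// Placeholder for a logit-derived unsafety signal in [0, 1].
    pub unsafe_score: f32,
}

/// Placeholder checks; the real layers are not specified in the abstract.
fn check(layer: Layer, call: &ToolCall) -> bool {
    match layer {
        Layer::Schema => !call.tool.is_empty(),
        Layer::TrustTier => call.trust_tier >= 1,
        Layer::RateLimit => call.calls_this_window < 100,
        Layer::AdversarialPreFilter => !call.args.contains("ignore previous"),
        Layer::ProbeLogitsGate => call.unsafe_score < 0.5,
        Layer::ConstitutionalPolicy => call.tool != "fs.delete_all",
    }
}

pub struct AuditEntry {
    pub prev: u64,
    pub record: String,
    pub hash: u64,
}

/// Chains each entry to its predecessor (DefaultHasher stands in for Blake3).
fn link_hash(prev: u64, record: &str) -> u64 {
    let mut h = DefaultHasher::new();
    prev.hash(&mut h);
    record.hash(&mut h);
    h.finish()
}

pub struct Gateway {
    pub audit: Vec<AuditEntry>,
}

impl Gateway {
    pub fn new() -> Self {
        Gateway { audit: Vec::new() }
    }

    /// Runs the layers in order, stops at the first failure, and logs the verdict.
    pub fn govern(&mut self, call: &ToolCall) -> Verdict {
        let verdict = PIPELINE
            .iter()
            .copied()
            .find(|&layer| !check(layer, call))
            .map_or(Verdict::Allow, Verdict::Deny);
        let record = format!("{}|{}|{:?}", call.tool, call.args, verdict);
        let prev = self.audit.last().map_or(0, |e| e.hash);
        let hash = link_hash(prev, &record);
        self.audit.push(AuditEntry { prev, record, hash });
        verdict
    }

    /// Recomputes every link; any edited record breaks the chain.
    pub fn verify_chain(&self) -> bool {
        let mut prev = 0u64;
        for e in &self.audit {
            if e.prev != prev || e.hash != link_hash(prev, &e.record) {
                return false;
            }
            prev = e.hash;
        }
        true
    }
}
```

In this sketch, a caller would construct a `Gateway`, pass each `ToolCall` to `govern`, and run `verify_chain` later to detect tampering with logged records. Placing the cheap checks before the ProbeLogits gate mirrors the abstract's ordering, so a call rejected early never reaches the inference step.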
