Governed MCP: Kernel-Level Tool Governance for AI Agents via Logit-Based Safety Primitives

Daeyeon Son

arXiv:2604.16870·cs.CR·April 21, 2026

TL;DR

This paper introduces Governed MCP, a kernel-level tool governance system for AI agents that enforces safety primitives through a multi-layered pipeline, significantly improving security over user-space solutions.

Contribution

It presents a novel kernel-resident gateway for tool call enforcement in AI agents, integrating a logit-based safety primitive and demonstrating its effectiveness and low overhead in a custom OS.

Findings

01

Removing ProbeLogits reduces F1 from 0.773 to 0.327

02

The five non-inference layers add 65.3 microseconds of overhead per call

03

Complete mediation of WASM-to-system functions is achieved

Abstract

AI agents increasingly call external tools (file system, network, APIs) through the Model Context Protocol (MCP). These tool calls are the agent's syscalls -- privileged operations with side effects on shared state -- yet today's safety enforcement lives entirely in userspace, where a 10-line script can bypass it. I propose Governed MCP, a kernel-resident tool governance gateway built on a logit-based safety primitive (ProbeLogits, companion paper: arXiv:2604.11943). The gateway interposes on every MCP tool call in a 6-layer pipeline: schema validation, trust tier check, rate limit, adversarial pre-filter, ProbeLogits gate (the load-bearing semantic check), and constitutional policy match, with a Blake3-hashed audit chain. I implement Governed MCP in Anima OS, a bare-metal x86_64 OS in approximately 86,000 lines of Rust. The five non-inference layers add 65.3 microseconds of overhead…
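The layer ordering and audit chain described in the abstract can be illustrated with a minimal sketch. It is not the Anima OS implementation: every type, field, threshold, and function name below (`ToolCall`, `check`, `govern`, `AuditChain`, the 0.5 cutoff, and so on) is hypothetical. The ProbeLogits gate is stubbed as a precomputed `unsafe_score` rather than a real logit read, and the standard library's `DefaultHasher` stands in for Blake3 so the example needs no external crates. The sketch shows only two ideas: layers run in the listed order and the first failing layer denies the call, and each audit entry hashes the previous entry so that the log forms a chain.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// The six layers named in the abstract, in the order listed there.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum Layer {
    SchemaValidation,
    TrustTierCheck,
    RateLimit,
    AdversarialPreFilter,
    ProbeLogitsGate,
    ConstitutionalPolicyMatch,
}

const PIPELINE: [Layer; 6] = [
    Layer::SchemaValidation,
    Layer::TrustTierCheck,
    Layer::RateLimit,
    Layer::AdversarialPreFilter,
    Layer::ProbeLogitsGate,
    Layer::ConstitutionalPolicyMatch,
];

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
enum Verdict {
    Allow,
    Deny(Layer), // records which layer rejected the call
}

/// Hypothetical tool call; all fields are illustrative assumptions.
#[derive(Debug, Clone)]
struct ToolCall {
    tool: String,
    args: String,
    trust_tier: u8,
    calls_in_window: u32,
    unsafe_score: f32, // stub for the ProbeLogits output
}

/// Placeholder predicates; the real checks are not specified in the abstract.
fn check(layer: Layer, call: &ToolCall) -> bool {
    match layer {
        Layer::SchemaValidation => !call.tool.is_empty(),
        Layer::TrustTierCheck => call.trust_tier >= 1,
        Layer::RateLimit => call.calls_in_window < 100,
        Layer::AdversarialPreFilter => !call.args.contains("ignore previous instructions"),
        Layer::ProbeLogitsGate => call.unsafe_score < 0.5,
        Layer::ConstitutionalPolicyMatch => call.tool != "fs.delete",
    }
}

/// Run the layers in order; the first failing layer short-circuits.
fn evaluate(call: &ToolCall) -> Verdict {
    for &layer in PIPELINE.iter() {
        if !check(layer, call) {
            return Verdict::Deny(layer);
        }
    }
    Verdict::Allow
}

/// Hash chain: each entry commits to the previous head.
/// DefaultHasher is a stand-in for Blake3 and is NOT cryptographically secure.
struct AuditChain {
    head: u64,
    entries: Vec<u64>,
}

impl AuditChain {
    fn new() -> Self {
        AuditChain { head: 0, entries: Vec::new() }
    }

    fn append(&mut self, call: &ToolCall, verdict: &Verdict) -> u64 {
        let mut h = DefaultHasher::new();
        self.head.hash(&mut h); // link to the previous entry
        call.tool.hash(&mut h);
        call.args.hash(&mut h);
        verdict.hash(&mut h);
        self.head = h.finish();
        self.entries.push(self.head);
        self.head
    }
}

/// Gateway entry point: decide, then log the decision either way.
fn govern(chain: &mut AuditChain, call: &ToolCall) -> Verdict {
    let verdict = evaluate(call);
    chain.append(call, &verdict);
    verdict
}
```

In this sketch, a call that fails several checks is attributed to the earliest failing layer, and both allowed and denied calls are appended to the chain, so altering an earlier record would change every later hash.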
