The Two Boundaries: Why Behavioral AI Governance Fails Structurally
Alan L. McCann

TL;DR
This paper analyzes the structural limitations of behavioral AI governance, demonstrating that effective governance requires architectural separation of effects from computation, and introduces coterminous governance as a key criterion.
Contribution
It provides a formal framework showing the inherent undecidability of behavioral effect governance and proposes coterminous governance as a necessary architectural design.
Findings
Rice's theorem implies the undecidability of effect compliance in general.
Coterminous governance requires separating computation from effects.
The framework's proofs are mechanized in Coq, comprising 454 theorems.
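To make the second finding concrete, here is a minimal sketch of one way computation can be separated from effects: the computation only returns descriptions of effects, and a single interpreter is the only place where they are checked and performed. This is an illustration under stated assumptions, not the paper's architecture. The names `Effect`, `plan`, `run`, and `ALLOWED_KINDS`, and the effect kinds themselves, are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Effect:
    """A description of an effect; constructing one performs nothing."""
    kind: str
    payload: str


# Hypothetical policy: the only effect kinds that governance permits.
ALLOWED_KINDS = frozenset({"db_write", "api_call"})


def plan(user_request: str) -> list[Effect]:
    """Pure computation: returns effect descriptions and performs none."""
    return [
        Effect("db_write", user_request),
        Effect("shell_exec", "cleanup"),  # not covered by the policy
    ]


def run(effects: list[Effect], handlers: dict[str, Callable[[str], None]]):
    """The single point where effects happen; each one is checked first."""
    executed, rejected = [], []
    for effect in effects:
        if effect.kind in ALLOWED_KINDS and effect.kind in handlers:
            handlers[effect.kind](effect.payload)
            executed.append(effect)
        else:
            rejected.append(effect)
    return executed, rejected


# Usage: handlers append to a log instead of touching real systems.
log: list[str] = []
handlers = {"db_write": log.append, "api_call": log.append}
executed, rejected = run(plan("save note"), handlers)
```

Because `plan` cannot perform effects itself, the check in `run` inspects a finite list of effect descriptions rather than trying to predict the behavior of an arbitrary program.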
Abstract
Every system that performs effects has two boundaries: what it can do (expressiveness) and what governance covers (governance). In nearly all deployed AI systems, these boundaries are defined independently, creating three regions: governed capabilities (the only useful region), ungoverned capabilities (risk), and governance policies that address non-existent capabilities (theater). Two of the three regions are failure modes. We focus on the governance of effects: actions that AI systems perform in the world (API calls, database writes, tool invocations). This is distinct from the governance of model outputs (content quality, bias, fairness), which operates at a different level and requires different mechanisms. We present a formal framework for analyzing this structural gap. Rice's theorem (1953) proves the gap is undecidable in the general case for any Turing-complete architecture that…
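The three regions described in the abstract can be pictured with plain set operations. The sketch below is illustrative only: the capability names are hypothetical, and `is_coterminous` reads "coterminous" in its ordinary sense of the two boundaries coinciding, which is an assumption because the paper's formal definition is not shown in this excerpt.

```python
# Hypothetical capability and policy names, chosen only for illustration.
capabilities = {"http_get", "db_write", "send_email"}    # what the system can do
governed = {"db_write", "send_email", "launch_rocket"}   # what governance covers


def regions(capabilities: set, governed: set) -> dict:
    """Split the two boundaries into the three regions named in the abstract."""
    return {
        "governed_capabilities": capabilities & governed,    # the useful region
        "ungoverned_capabilities": capabilities - governed,  # risk
        "theater": governed - capabilities,                  # policies with no capability
    }


def is_coterminous(capabilities: set, governed: set) -> bool:
    """Assumed reading: both failure regions are empty, so the boundaries coincide."""
    return capabilities == governed


result = regions(capabilities, governed)
```

In this toy example `http_get` is ungoverned risk and `launch_rocket` is theater. Both failure regions are empty only when the two sets are equal.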
