Executable Governance for AI: Translating Policies into Rules Using LLMs
Gautam Varma Datla, Anudeep Vurity, Tejaswani Dash, Tazeem Ahmad, Mohd Adnan, Saima Rafi

TL;DR
This paper introduces Policy-to-Tests (P2T), a framework that automatically converts natural-language AI policies into executable, machine-readable rules using LLMs, enabling scalable and accurate policy enforcement.
Contribution
The paper presents a pipeline and a compact domain-specific language (DSL) for translating natural-language policies into executable rules, applied across general frameworks, sector guidance, and enterprise standards, with generated rules that closely match human annotations.
Findings
Generated rules closely match human annotations
High inter-annotator agreement on gold standards
Improved safety and robustness in AI agents
Abstract
AI policy guidance is predominantly written as prose, which practitioners must first convert into executable rules before frameworks can evaluate or enforce them. This manual step is slow, error-prone, difficult to scale, and often delays the use of safeguards in real-world deployments. To address this gap, we present Policy-to-Tests (P2T), a framework that converts natural-language policy documents into normalized, machine-readable rules. The framework comprises a pipeline and a compact domain-specific language (DSL) that encodes hazards, scope, conditions, exceptions, and required evidence, yielding a canonical representation of extracted rules. To test the framework beyond a single policy, we apply it across general frameworks, sector guidance, and enterprise standards, extracting obligation-bearing clauses and converting them into executable rules. These AI-generated rules closely…
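The five elements the abstract says the DSL encodes (hazards, scope, conditions, exceptions, and required evidence) could be modeled roughly as below. This is only an illustrative sketch: the summary does not show the paper's actual DSL syntax, so the class name Rule, its field names, the canonical() and applies() methods, and the example values are all hypothetical. It shows one plausible way to obtain a canonical representation (deterministic serialization) and to make a rule executable (conditions must hold, exceptions must not).

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule record; fields mirror the abstract, not the paper's DSL."""
    hazard: str
    scope: str
    conditions: tuple = ()
    exceptions: tuple = ()
    evidence: tuple = ()

    def canonical(self) -> str:
        """Serialize deterministically: trimmed, lowercased, sorted values and keys."""
        def norm(value):
            if isinstance(value, str):
                return value.strip().lower()
            return sorted(item.strip().lower() for item in value)

        data = {key: norm(value) for key, value in asdict(self).items()}
        return json.dumps(data, sort_keys=True)

    def applies(self, facts: set) -> bool:
        """The rule fires when every condition holds and no exception holds."""
        all_conditions = all(c in facts for c in self.conditions)
        any_exception = any(e in facts for e in self.exceptions)
        return all_conditions and not any_exception


# Example values are invented for illustration only.
rule = Rule(
    hazard="Privacy Leak",
    scope="customer-support agent",
    conditions=("output_contains_pii",),
    exceptions=("user_consent_recorded",),
    evidence=("redaction_log",),
)
same_rule = Rule(
    hazard=" privacy leak ",
    scope="Customer-Support Agent",
    conditions=("output_contains_pii",),
    exceptions=("user_consent_recorded",),
    evidence=("redaction_log",),
)
```

Under these assumptions, rule.canonical() and same_rule.canonical() return the same string, so two extractions of the same clause that differ only in casing or whitespace compare as equal, and rule.applies(...) can be called with the set of facts observed for a given agent output.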
Taxonomy
Topics: Ethics and Social Impacts of AI · Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning
