A Trace-Based Assurance Framework for Agentic AI Orchestration: Contracts, Testing, and Governance
Ciprian Paduraru, Petru-Liviu Bouruc, Alin Stefanescu

TL;DR
This paper introduces an assurance framework for LLM-based agentic AI systems. It combines contracts, testing, and governance to support reliability, containment, and compliance in complex multi-agent interactions.
Contribution
It proposes a novel trace-based assurance framework with explicit contracts, stress testing, fault injection, and governance mechanisms for agentic AI orchestration.
Findings
Contracts enable machine-checkable verdicts and trace localization.
Stress testing identifies vulnerabilities through bounded perturbations.
Governance enforces runtime capability limits and action controls.
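As a rough illustration of what enforcing runtime capability limits and action controls could look like, the sketch below gates each requested action behind a per-agent allowlist and a per-action call budget. The class name `CapabilityGate`, its fields, and the agent and action names are hypothetical and are not taken from the paper; the paper's actual governance mechanism may differ.

```python
from collections import Counter
from typing import Dict, Set


class CapabilityGate:
    """Hypothetical runtime gate: per-agent allowlist plus per-action call budget."""

    def __init__(self, allowed: Dict[str, Set[str]], budget: Dict[str, int]) -> None:
        self.allowed = allowed   # agent name -> actions that agent may request
        self.budget = budget     # action name -> maximum number of calls per run
        self.used = Counter()    # action name -> calls authorized so far

    def authorize(self, agent: str, action: str) -> bool:
        # Deny actions outside the agent's declared capabilities.
        if action not in self.allowed.get(agent, set()):
            return False
        # Deny once the action's budget is exhausted (no budget means zero calls).
        if self.used[action] >= self.budget.get(action, 0):
            return False
        self.used[action] += 1
        return True


gate = CapabilityGate(
    allowed={"worker": {"send_email"}},
    budget={"send_email": 1},
)
first = gate.authorize("worker", "send_email")     # within allowlist and budget
second = gate.authorize("worker", "send_email")    # budget exhausted
foreign = gate.authorize("planner", "send_email")  # agent lacks the capability
```

Denying by default (an unknown agent or an unbudgeted action is refused) is an assumption of this sketch, chosen because it keeps side effects contained when the configuration is incomplete.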
Abstract
In Agentic AI, Large Language Models (LLMs) are increasingly used in the orchestration layer to coordinate multiple agents and to interact with external services, retrieval components, and shared memory. In this setting, failures are not limited to incorrect final outputs. They also arise from long-horizon interaction, stochastic decisions, and external side effects (such as API calls, database writes, and message sends). Common failures include non-termination, role drift, propagation of unsupported claims, and attacks via untrusted context or external channels. This paper presents an assurance framework for such Agentic AI systems. Executions are instrumented as Message-Action Traces (MAT) with explicit step and trace contracts. Contracts provide machine-checkable verdicts, localize the first violating step, and support deterministic replay. The framework includes stress testing,…
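The idea of step and trace contracts that yield a machine-checkable verdict and localize the first violating step can be sketched as follows. This is a minimal sketch under stated assumptions: the `Step` and `Verdict` types, the `check_trace` function, and the use of a step bound as a stand-in for a termination contract are all hypothetical and are not the paper's MAT schema.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class Step:
    agent: str
    kind: str      # "message" or "action" (assumed vocabulary)
    payload: str


@dataclass
class Verdict:
    ok: bool
    first_violation: Optional[int] = None  # position of the first violating step
    contract: Optional[str] = None         # name of the violated contract


# A step contract is modeled here as a (name, predicate) pair over a single step.
StepContract = Tuple[str, Callable[[Step], bool]]


def check_trace(trace: List[Step], step_contracts: List[StepContract],
                max_steps: int) -> Verdict:
    """Scan the trace in order and stop at the first violation."""
    for position, step in enumerate(trace):
        # Trace-level contract: a bounded number of steps.
        if position >= max_steps:
            return Verdict(False, position, "max_steps")
        # Step-level contracts: every predicate must hold for this step.
        for name, holds in step_contracts:
            if not holds(step):
                return Verdict(False, position, name)
    return Verdict(True)


trace = [
    Step("planner", "message", "plan"),
    Step("worker", "action", "db.write"),
    Step("worker", "action", "send_email"),
]
contracts = [
    ("allowed_action", lambda s: s.kind != "action" or s.payload in {"db.write"}),
]
verdict = check_trace(trace, contracts, max_steps=10)
```

Because the check is a pure function of a recorded trace, running it again on the same trace gives the same verdict, which is the property a deterministic replay would rely on.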
Taxonomy
Topics: Multi-Agent Systems and Negotiation · Access Control and Trust · Software System Performance and Reliability
