Taming Silent Failures: A Framework for Verifiable AI Reliability
Guan-Yan Yang, Farn Wang

TL;DR
This paper presents FAME, a framework that combines offline formal synthesis with online runtime monitoring to detect silent AI failures in safety-critical systems. In an autonomous vehicle perception case study, it detected 93.5% of critical safety violations that were otherwise silent.
Contribution
Introduces FAME (Formal Assurance and Monitoring Environment), a framework that integrates offline formal synthesis with online runtime monitoring to place a verifiable safety net around opaque AI components in safety-critical applications.
Findings
FAME detected 93.5% of critical safety violations in an autonomous vehicle perception system that were otherwise silent.
The framework is contextualized within the ISO 26262 and ISO/PAS 8800 safety standards.
The results support the feasibility of certifiable, trustworthy AI in safety-critical systems.
Abstract
The integration of Artificial Intelligence (AI) into safety-critical systems introduces a new reliability paradigm: silent failures, where AI produces confident but incorrect outputs that can be dangerous. This paper introduces the Formal Assurance and Monitoring Environment (FAME), a novel framework that confronts this challenge. FAME synergizes the mathematical rigor of offline formal synthesis with the vigilance of online runtime monitoring to create a verifiable safety net around opaque AI components. We demonstrate its efficacy in an autonomous vehicle perception system, where FAME successfully detected 93.5% of critical safety violations that were otherwise silent. By contextualizing our framework within the ISO 26262 and ISO/PAS 8800 standards, we provide reliability engineers with a practical, certifiable pathway for deploying trustworthy AI. FAME represents a crucial shift from…
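To make the idea of an online runtime monitor more concrete, here is a minimal Python sketch. It is not FAME's actual monitor or synthesis output: the safety property (a nearby object must not vanish from one frame to the next), the `Detection` and `PersistenceMonitor` names, and the 30 m threshold are all hypothetical choices made for illustration. It only shows the general pattern of checking an explicit property against the outputs of an opaque perception component.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Detection:
    """One frame of (hypothetical) perception output for a tracked object."""
    frame: int
    distance_m: Optional[float]  # None means the object was not detected
    confidence: float


class PersistenceMonitor:
    """Checks a hypothetical safety property on a stream of detections.

    Property (an assumption for this sketch): if an object was detected
    within `critical_m` metres on the previous frame, it must still be
    detected on the current frame.
    """

    def __init__(self, critical_m: float = 30.0) -> None:
        self.critical_m = critical_m
        self._previous: Optional[Detection] = None

    def check(self, current: Detection) -> bool:
        """Return True if the property holds for this frame, False on violation."""
        prev = self._previous
        self._previous = current
        if prev is None or prev.distance_m is None:
            return True  # nothing to compare against
        was_critical = prev.distance_m <= self.critical_m
        vanished = current.distance_m is None
        return not (was_critical and vanished)


def run_monitor(stream: Iterable[Detection], monitor: PersistenceMonitor) -> List[int]:
    """Return the frame numbers at which the monitor reports a violation."""
    return [d.frame for d in stream if not monitor.check(d)]


# Illustrative data only: the object approaches, then vanishes at frame 2.
stream = [
    Detection(frame=0, distance_m=45.0, confidence=0.97),
    Detection(frame=1, distance_m=28.0, confidence=0.98),
    Detection(frame=2, distance_m=None, confidence=0.0),
    Detection(frame=3, distance_m=20.0, confidence=0.95),
]
violations = run_monitor(stream, PersistenceMonitor(critical_m=30.0))
print(violations)
```

In this toy stream the perception component raises no error of its own at frame 2, so the dropout would be silent; the monitor flags it because the explicit property is checked independently of the model's confidence.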
