AI-Compass: A Comprehensive and Effective Multi-module Testing Tool for AI Systems
Zhiyu Zhu, Zhibo Jin, Hongsheng Hu, Minhui Xue, Ruoxi Sun, Seyit Camtepe, Praveen Gauravaram, and Huaming Chen

TL;DR
AI-Compass is a multi-module testing tool that thoroughly evaluates AI systems' robustness, interpretability, and neuron behavior across different modalities, addressing current limitations in comprehensive testing.
Contribution
The paper introduces AI-Compass, a novel comprehensive testing tool that assesses multiple aspects of AI systems simultaneously, improving over existing ad-hoc testing methods.
Findings
AI-Compass effectively evaluates adversarial robustness, interpretability, and neuron analysis.
The tool demonstrates state-of-the-art performance across image classification, object detection, and text classification tasks.
Extensive experiments validate its comprehensive assessment capabilities.
Abstract
AI systems, particularly those built with deep learning techniques, have demonstrated superior performance in various real-world applications. Given the need for tailored optimization in specific scenarios, as well as concerns about the exploitation of subsurface vulnerabilities, more comprehensive and in-depth testing of AI systems has become a pivotal topic. Testing tools have emerged in real-world applications that aim to expand testing capabilities. However, they often concentrate on ad-hoc tasks, rendering them unsuitable for simultaneously testing multiple aspects or components. Furthermore, trustworthiness issues arising from adversarial attacks and the difficulty of interpreting deep learning models pose new challenges for developing more comprehensive and in-depth AI system testing tools. In this study, we design and implement a testing tool, AI-Compass, to comprehensively and…
Taxonomy
Topics: Fault Detection and Control Systems
