Multisource AI Scorecard Table for System Evaluation
Erik Blasch, James Sung, Tao Nguyen

TL;DR
The paper introduces a Multisource AI Scorecard Table (MAST) to standardize evaluation and promote trust in AI systems through principles aligned with intelligence community standards, supporting transparency and consistency.
Contribution
It develops a standard checklist, based on the intelligence community (IC) principles of good analysis, for assessing AI system performance, with emphasis on interpretability, validation, and stakeholder agreement.
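To make the checklist idea concrete, here is a minimal sketch of how a scorecard of this kind could be represented in code. It is an illustration only, not the paper's table: the criterion names are borrowed from terms used in this summary, and the 0 to 3 rating scale, the `Scorecard` class, and its methods are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical criteria, taken from terms used in the summary text.
# They are NOT the paper's actual checklist items.
CRITERIA = ("interpretability", "explainability", "verification", "validation")

# Assumed rating scale: integers from 0 (absent) to 3 (fully addressed).
MAX_RATING = 3


@dataclass
class Scorecard:
    """A simple checklist that records one rating per criterion."""

    system_name: str
    ratings: dict = field(default_factory=dict)

    def rate(self, criterion: str, rating: int) -> None:
        """Record a rating, rejecting unknown criteria and out-of-range values."""
        if criterion not in CRITERIA:
            raise ValueError(f"unknown criterion: {criterion}")
        if not 0 <= rating <= MAX_RATING:
            raise ValueError(f"rating must be between 0 and {MAX_RATING}")
        self.ratings[criterion] = rating

    def missing(self) -> list:
        """Return the criteria that have not been rated yet, in checklist order."""
        return [c for c in CRITERIA if c not in self.ratings]

    def total(self) -> int:
        """Sum of all recorded ratings."""
        return sum(self.ratings.values())

    def is_complete(self) -> bool:
        """True once every criterion has a rating."""
        return not self.missing()


if __name__ == "__main__":
    card = Scorecard("demo-classifier")
    card.rate("interpretability", 2)
    card.rate("validation", 3)
    print(card.total(), card.missing(), card.is_complete())
```

Keeping the criteria in a fixed tuple means every system is scored against the same items, which is the consistency property a shared checklist is meant to provide; incomplete scorecards are flagged rather than silently compared.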
Findings
Supports transparency and trust in AI systems.
Provides a framework for AI evaluation based on IC standards.
Includes practical use cases for security and comparison.
Abstract
The paper describes a Multisource AI Scorecard Table (MAST) that provides the developer and user of an artificial intelligence (AI)/machine learning (ML) system with a standard checklist focused on the principles of good analysis adopted by the intelligence community (IC) to help promote the development of more understandable systems and engender trust in AI outputs. Such a scorecard enables a transparent, consistent, and meaningful understanding of AI tools applied for commercial and government use. A standard is built on compliance and agreement through policy, which requires buy-in from the stakeholders. While consistency for testing might only exist across a standard data set, the community requires discussion on verification and validation approaches which can lead to interpretability, explainability, and proper use. The paper explores how the analytic tradecraft standards outlined…
Taxonomy
Topics: Digital Transformation in Industry · Ethics and Social Impacts of AI · Explainable Artificial Intelligence (XAI)
