Measuring AI Systems Beyond Accuracy

Violet Turri; Rachel Dzombak; Eric Heim; Nathan VanHoudnos; Jay Palat; Anusha Sinha

arXiv:2204.04211 · cs.SE · April 11, 2022 · 1 citation

Open Access

TL;DR

This paper advocates for a comprehensive, integrated approach to testing AI systems, emphasizing the need for cross-domain evaluation methods to improve reliability beyond traditional accuracy metrics.

Contribution

It introduces six key questions to guide a holistic testing and evaluation strategy for AI systems, promoting a more complete assessment framework.

Findings

01. Highlights limitations of current T&E methods
02. Proposes a set of guiding questions for holistic evaluation
03. Encourages cross-domain and lifecycle-aware testing approaches

Abstract

Current test and evaluation (T&E) methods for assessing machine learning (ML) system performance often rely on incomplete metrics. Testing is additionally often siloed from the other phases of the ML system lifecycle. Research investigating cross-domain approaches to ML T&E is needed to drive the state of the art forward and to build an Artificial Intelligence (AI) engineering discipline. This paper advocates for a robust, integrated approach to testing by outlining six key questions for guiding a holistic T&E strategy.

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Fault Detection and Control Systems · Machine Learning and Data Classification