A Holistic Assessment of the Reliability of Machine Learning Systems

Anthony Corso; David Karamadian; Romeo Valentin; Mary Cooper; Mykel J. Kochenderfer

arXiv:2307.10586 · cs.LG · August 1, 2023 · 6 cites

Open Access

TL;DR

This paper introduces a comprehensive framework for assessing the reliability of machine learning systems across multiple key properties, providing a holistic view of their robustness in high-stakes applications.

Contribution

It proposes a new holistic assessment methodology and reliability score, evaluating multiple reliability metrics and analyzing over 500 models to identify techniques that improve overall system dependability.

Findings

01

Designing for one reliability metric does not necessarily constrain the others

02

Certain algorithmic techniques can improve several reliability metrics simultaneously

03

The framework offers a comprehensive reliability evaluation approach

Abstract

As machine learning (ML) systems increasingly permeate high-stakes settings such as healthcare, transportation, military, and national security, concerns regarding their reliability have emerged. Despite notable progress, the performance of these systems can significantly diminish due to adversarial attacks or environmental changes, leading to overconfident predictions, failures to detect input faults, and an inability to generalize in unexpected scenarios. This paper proposes a holistic assessment methodology for the reliability of ML systems. Our framework evaluates five key properties: in-distribution accuracy, distribution-shift robustness, adversarial robustness, calibration, and out-of-distribution detection. A reliability score is also introduced and used to assess the overall system reliability. To provide insights into the performance of different algorithmic approaches, we…
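The abstract names five properties and a single reliability score but, as excerpted here, does not say how the score is aggregated. The sketch below is only an illustration of the idea: it assumes each property has already been reduced to a number in [0, 1] where higher is better, and it combines them with an unweighted mean. The class name `ReliabilityMetrics`, the function `reliability_score`, the field names, and the mean aggregation are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ReliabilityMetrics:
    """Hypothetical container for the five properties listed in the abstract.

    Assumption: every field is already scaled to [0, 1], higher is better.
    """
    in_distribution_accuracy: float
    distribution_shift_robustness: float
    adversarial_robustness: float
    calibration: float
    ood_detection: float


def reliability_score(metrics: ReliabilityMetrics) -> float:
    """Combine the five metrics into one number.

    Assumption: an unweighted arithmetic mean. The paper's actual
    aggregation may differ; this is a placeholder for illustration.
    """
    values = [getattr(metrics, f.name) for f in fields(metrics)]
    for name, value in zip((f.name for f in fields(metrics)), values):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
    return sum(values) / len(values)


# Made-up numbers for a single hypothetical model, not results from the paper.
example = ReliabilityMetrics(
    in_distribution_accuracy=0.9,
    distribution_shift_robustness=0.7,
    adversarial_robustness=0.5,
    calibration=0.8,
    ood_detection=0.6,
)
score = reliability_score(example)
print(f"reliability score: {score:.2f}")
```

An unweighted mean treats all five properties as equally important; a deployment that cares more about, say, adversarial robustness could swap in a weighted mean without changing the interface.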

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Software Reliability and Analysis Research