On the Safety of Interpretable Machine Learning: A Maximum Deviation Approach

Dennis Wei; Rahul Nair; Amit Dhurandhar; Kush R. Varshney; Elizabeth M. Daly; Moninder Singh

arXiv:2211.01498·cs.LG·November 4, 2022·6 cites

TL;DR

This paper introduces a quantitative approach to assess the safety of interpretable machine learning models by measuring their maximum deviation from a safe reference model, using optimization and bandit techniques.

Contribution

The paper proposes maximum deviation as a metric for safety assessment, shows how to compute or bound it for decision trees, generalized linear and additive models, and tree ensembles, and demonstrates through case studies how interpretability aids safety evaluation.

Findings

01 The maximum deviation can be computed exactly and efficiently for decision trees and for generalized linear and additive models.

02 For tree ensembles, which are not regarded as interpretable, discrete optimization techniques still provide informative bounds on the maximum deviation.

03 Interpretability leads to tighter safety bounds and improved model assessment.

Abstract

Interpretable and explainable machine learning has seen a recent surge of interest. We focus on safety as a key motivation behind the surge and make the relationship between interpretability and safety more quantitative. Toward assessing safety, we introduce the concept of maximum deviation via an optimization problem to find the largest deviation of a supervised learning model from a reference model regarded as safe. We then show how interpretability facilitates this safety assessment. For models including decision trees, generalized linear and additive models, the maximum deviation can be computed exactly and efficiently. For tree ensembles, which are not regarded as interpretable, discrete optimization techniques can still provide informative bounds. For a broader class of piecewise Lipschitz functions, we leverage the multi-armed bandit literature to show that interpretability…
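To make the optimization in the abstract concrete, here is a minimal sketch, not the authors' implementation. It assumes a one-dimensional piecewise-constant model (a stand-in for a decision tree), a linear reference model, absolute difference as the deviation measure, and a closed interval as the region over which safety is checked. The names `Leaf`, `max_deviation`, `cert_lo`, and `cert_hi` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Leaf:
    """One leaf of a 1-D tree: the model predicts `value` on [lo, hi].

    Intervals are treated as closed for simplicity (an assumption of
    this sketch; real trees use half-open splits).
    """
    lo: float
    hi: float
    value: float


def max_deviation(
    leaves: List[Leaf],
    ref_slope: float,
    ref_intercept: float,
    cert_lo: float,
    cert_hi: float,
) -> Tuple[float, Optional[float]]:
    """Largest |f(x) - f0(x)| over x in [cert_lo, cert_hi].

    f is the piecewise-constant model given by `leaves`; f0 is the
    linear reference f0(x) = ref_slope * x + ref_intercept.
    Returns (deviation, maximizing x), or (-inf, None) if no leaf
    overlaps the interval.
    """
    best_dev, best_x = float("-inf"), None
    for leaf in leaves:
        # Restrict the leaf to the region being checked.
        lo, hi = max(leaf.lo, cert_lo), min(leaf.hi, cert_hi)
        if lo > hi:
            continue
        # On a single leaf, |constant - linear| is convex in x,
        # so its maximum over [lo, hi] is attained at an endpoint.
        for x in (lo, hi):
            dev = abs(leaf.value - (ref_slope * x + ref_intercept))
            if dev > best_dev:
                best_dev, best_x = dev, x
    return best_dev, best_x


# Toy model with three leaves, compared against the reference f0(x) = x.
toy_leaves = [Leaf(0.0, 1.0, 0.0), Leaf(1.0, 2.0, 1.0), Leaf(2.0, 3.0, 5.0)]
full_result = max_deviation(toy_leaves, 1.0, 0.0, 0.0, 3.0)
narrow_result = max_deviation(toy_leaves, 1.0, 0.0, 0.0, 1.5)
```

The exact answer comes from enumerating leaves because the model's structure is visible. Shrinking the interval from [0, 3] to [0, 1.5] excludes the third leaf and lowers the reported deviation, which shows how the result depends on the region over which the model is checked.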

Videos

On the Safety of Interpretable Machine Learning: A Maximum Deviation Approach · SlidesLive

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Advanced Bandit Algorithms Research