On the Importance of Application-Grounded Experimental Design for Evaluating Explainable ML Methods

Kasun Amarasinghe; Kit T. Rodolfa; Sérgio Jesus; Valerie Chen; Vladimir Balayan; Pedro Saleiro; Pedro Bizarro; Ameet Talwalkar; Rayid Ghani

arXiv:2206.13503 · cs.LG · February 23, 2023 · 5 cites

TL;DR

This study argues for application-grounded experimental design when evaluating explainable ML methods, showing that conclusions drawn without a realistic deployment context can be misleading.

Contribution

The paper demonstrates that a realistic experimental setup can lead to different conclusions about the utility of explainable ML methods than a simplified one, and it advocates for context-aware evaluation.

Findings

01 No evidence of utility for tested methods in realistic setting

02 Simplistic evaluation designs can mislead conclusions

03 Highlights need for context-specific explainability methods

Abstract

Most existing evaluations of explainable machine learning (ML) methods rely on simplifying assumptions or proxies that do not reflect real-world use cases; the handful of more robust evaluations in real-world settings have shortcomings in their design, resulting in limited conclusions about the methods' real-world utility. In this work, we seek to bridge this gap by conducting a study that evaluates three popular explainable ML methods in a setting consistent with the intended deployment context. We build on a previous study on e-commerce fraud detection and make crucial modifications to its setup, relaxing the simplifying assumptions made in the original work that departed from the deployment context. In doing so, we draw drastically different conclusions from the earlier work and find no evidence for the incremental utility of the tested methods in the task. Our results highlight how…

Taxonomy

Topics: Machine Learning and Data Classification · Explainable Artificial Intelligence (XAI) · Imbalanced Data Classification Techniques