A Human-Grounded Evaluation Benchmark for Local Explanations of Machine Learning

Sina Mohseni; Jeremy E. Block; Eric D. Ragan

arXiv:1801.05075·cs.HC·June 30, 2020·47 cites

TL;DR

This paper introduces a human attention benchmark for evaluating model explanations in image and text domains, demonstrating its effectiveness over traditional segmentation masks and revealing user biases in subjective ratings.

Contribution

It proposes a multi-layer human attention mask benchmark for explanation evaluation and demonstrates its advantages over existing ground-truth methods.

Findings

01. The benchmark effectively evaluates explanations using human attention data.

02. Threshold-agnostic evaluation surpasses single-layer segmentation masks.

03. User biases influence subjective ratings of model explanations.

Abstract

Research in interpretable machine learning proposes different computational and human-subject approaches to evaluate model saliency explanations. These approaches measure different qualities of explanations to achieve diverse goals in designing interpretable machine learning systems. In this paper, we propose a human attention benchmark for image and text domains using multi-layer human attention masks aggregated from multiple human annotators. We then present a study that evaluates model saliency explanations obtained using the Grad-CAM and LIME techniques. We demonstrate our benchmark's utility for quantitative evaluation of model explanations by comparing it with human subjective ratings and with evaluations based on ground-truth single-layer segmentation masks. Our study results show that our threshold-agnostic evaluation method with the human attention baseline is more effective than…
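The abstract's idea of a multi-layer attention mask aggregated from several annotators can be illustrated with a small sketch. Everything below is an assumption for illustration only: the function names `aggregate_masks`, `normalize`, and `mean_absolute_error` are hypothetical, the toy data is made up, and mean absolute error is used as one possible score that needs no binarization threshold. It is not confirmed to be the metric used in the paper or the official repository.

```python
import numpy as np


def aggregate_masks(annotator_masks):
    """Average binary masks from several annotators into one mask in [0, 1].

    Each pixel's value is the fraction of annotators who marked it, so the
    result has several intensity levels (a "multi-layer" mask) rather than
    a single binary layer.
    """
    stacked = np.stack([np.asarray(m, dtype=float) for m in annotator_masks])
    return stacked.mean(axis=0)


def normalize(saliency):
    """Rescale a saliency map to [0, 1]; a constant map becomes all zeros."""
    s = np.asarray(saliency, dtype=float)
    span = s.max() - s.min()
    return np.zeros_like(s) if span == 0 else (s - s.min()) / span


def mean_absolute_error(saliency, attention):
    """Compare a saliency map with the aggregated mask without thresholding.

    Assumed metric for this sketch: lower means closer agreement.
    """
    return float(np.abs(normalize(saliency) - attention).mean())


# Toy 2x2 example: three hypothetical annotators mark salient pixels.
masks = [
    [[1, 1], [0, 0]],
    [[1, 0], [0, 0]],
    [[1, 1], [0, 1]],
]
attention = aggregate_masks(masks)

# A hypothetical model saliency map (e.g. from Grad-CAM or LIME).
saliency = [[0.9, 0.5], [0.1, 0.1]]
score = mean_absolute_error(saliency, attention)
print(attention)
print(score)
```

Because the aggregated mask keeps graded agreement levels, a score like this never requires choosing a cutoff to binarize either map, which is the property the abstract calls threshold-agnostic. A single-layer segmentation mask, by contrast, would treat every marked pixel as equally important.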

Code & Models

Repositories

SinaMohseni/ML-Interpretability-Evaluation-Benchmark
Official

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Scientific Computing and Data Management · Data Visualization and Analytics

Methods: Local Interpretable Model-Agnostic Explanations