Towards Benchmarking the Utility of Explanations for Model Debugging
Maximilian Idahl, Lijun Lyu, Ujwal Gadiraju, Avishek Anand

TL;DR
This vision paper argues for a benchmark that evaluates how useful post-hoc explanation methods are for debugging text classifiers, covering both the effectiveness and the efficiency of explanations.
Contribution
The paper makes the case for a benchmark that evaluates the utility of explanations in model debugging and enumerates the desirable properties such a benchmark should have.
Findings
Identifies key properties for an explanation utility benchmark
Highlights the importance of evaluating both effectiveness and efficiency
Proposes a first step towards standardized evaluation of explanation methods
Abstract
Post-hoc explanation methods are an important class of approaches that help understand the rationale underlying a trained model's decision. But how useful are they for an end-user towards accomplishing a given task? In this vision paper, we argue the need for a benchmark to facilitate evaluations of the utility of post-hoc explanation methods. As a first step to this end, we enumerate desirable properties that such a benchmark should possess for the task of debugging text classifiers. Additionally, we highlight that such a benchmark facilitates not only assessing the effectiveness of explanations but also their efficiency.
