Faithful and Plausible Natural Language Explanations for Image Classification: A Pipeline Approach

Adam Wojciechowski; Mateusz Lango; Ondrej Dusek

arXiv:2407.20899·cs.AI·March 19, 2025

TL;DR

This paper introduces a post-hoc natural language explanation pipeline for CNN image classifiers that produces faithful, plausible, and grounded explanations without affecting the classifier's performance.

Contribution

It presents a novel pipeline approach that generates faithful and plausible natural language explanations by analyzing influential neurons and activation maps, applicable to any CNN classifier.
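As a rough illustration of what "analyzing influential neurons" could look like, the sketch below ranks the inputs of a final linear layer by their contribution to one class logit, computed as activation times weight. This scoring rule is an assumption chosen for illustration, not necessarily the criterion used in the paper, and the function name `top_influential_neurons` and the toy values are hypothetical.

```python
from typing import List, Sequence, Tuple


def top_influential_neurons(
    activations: Sequence[float],
    class_weights: Sequence[float],
    k: int = 3,
) -> List[Tuple[int, float]]:
    """Rank neurons by their contribution to a single class logit.

    Assumption: contribution = pooled activation * final-layer weight
    for the predicted class. The paper may use a different criterion.
    """
    contributions = [
        (index, activation * weight)
        for index, (activation, weight) in enumerate(zip(activations, class_weights))
    ]
    # Largest positive contribution first.
    contributions.sort(key=lambda pair: pair[1], reverse=True)
    return contributions[:k]


# Toy values standing in for globally pooled CNN features and the
# final-layer weights of the predicted class.
pooled = [0.0, 2.0, 0.5, 1.5]
weights = [1.0, 0.5, -2.0, 1.0]
top = top_influential_neurons(pooled, weights, k=2)
print(top)
```

In a real CNN the pooled activations would typically be read from the layer before the classifier head, for example with a forward hook, while the classifier itself stays unchanged.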

Findings

1. Generated explanations are more faithful and plausible than baselines.

2. User interventions improve explanation quality threefold.

3. Method does not alter classifier training or performance.
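One way to picture a user intervention is to mask a neuron that an explanation names and check how the class score changes. The sketch below assumes that an intervention means zeroing a neuron's contribution to a linear class logit. The page does not spell out the intervention protocol, so this is an illustrative assumption, and `class_logit` and the toy values are hypothetical.

```python
from typing import AbstractSet, Sequence


def class_logit(
    activations: Sequence[float],
    class_weights: Sequence[float],
    bias: float = 0.0,
    masked: AbstractSet[int] = frozenset(),
) -> float:
    """Linear class score, skipping any neuron whose index is masked.

    Assumption: masking a neuron means removing its contribution to the logit.
    """
    return bias + sum(
        activation * weight
        for index, (activation, weight) in enumerate(zip(activations, class_weights))
        if index not in masked
    )


# Toy pooled features and class weights (hypothetical values).
pooled = [0.0, 2.0, 0.5, 1.5]
weights = [1.0, 0.5, -2.0, 1.0]

baseline = class_logit(pooled, weights)
after_mask = class_logit(pooled, weights, masked={3})
drop = baseline - after_mask
print(baseline, after_mask, drop)
```

A large drop after masking a neuron cited in an explanation suggests the explanation points at something the classifier actually relies on. The classifier's weights are never modified, only the forward computation for the probe.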

Abstract

Existing explanation methods for image classification struggle to provide faithful and plausible explanations. This paper addresses this issue by proposing a post-hoc natural language explanation method that can be applied to any CNN-based classifier without altering its training process or affecting predictive performance. By analysing influential neurons and the corresponding activation maps, the method generates a faithful description of the classifier's decision process in the form of a structured meaning representation, which is then converted into text by a language model. Through this pipeline approach, the generated explanations are grounded in the neural network architecture, providing accurate insight into the classification process while remaining accessible to non-experts. Experimental results show that the NLEs constructed by our method are significantly more plausible and…
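The pipeline idea in the abstract, a structured meaning representation that is later turned into text, can be sketched as follows. The field names (`concept`, `location`), the linearized format, and the example values are all hypothetical. The abstract states that a language model performs the conversion to text, so the fixed template below is only a stand-in to keep the example self-contained.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Evidence:
    concept: str   # hypothetical label for what an influential neuron responds to
    location: str  # hypothetical coarse position read off an activation map


@dataclass
class MeaningRepresentation:
    predicted_class: str
    evidence: List[Evidence]


def linearize(mr: MeaningRepresentation) -> str:
    """Flatten the structure into a string, e.g. as input for a text generator."""
    parts = [f"class={mr.predicted_class}"]
    parts += [
        f"evidence(concept={e.concept}, location={e.location})" for e in mr.evidence
    ]
    return "; ".join(parts)


def verbalize_with_template(mr: MeaningRepresentation) -> str:
    """Template stand-in for the language model step described in the abstract."""
    if not mr.evidence:
        return f"The image was classified as {mr.predicted_class}."
    phrases = [f"{e.concept} in the {e.location}" for e in mr.evidence]
    if len(phrases) == 1:
        joined = phrases[0]
    else:
        joined = ", ".join(phrases[:-1]) + " and " + phrases[-1]
    return (
        f"The image was classified as {mr.predicted_class} "
        f"because the network detected {joined}."
    )


mr = MeaningRepresentation(
    predicted_class="zebra",
    evidence=[Evidence("stripes", "center"), Evidence("grass", "bottom")],
)
print(linearize(mr))
print(verbalize_with_template(mr))
```

Keeping the meaning representation separate from the wording means the content of the explanation is fixed before any text is generated, which is what lets the text stay tied to what the network computed.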

Code & Models

Repositories

wojciechowskiofficial/flex (PyTorch, official)

Videos

Faithful and Plausible Natural Language Explanations for Image Classification: A Pipeline Approach (Underline)

Taxonomy

Topics: Explainable Artificial Intelligence (XAI)