Does It Make Sense to Explain a Black Box With Another Black Box?

Julien Delaunay; Luis Galárraga; Christine Largouët

arXiv:2404.14943·cs.CL·April 24, 2024·1 cites

TL;DR

This paper compares transparent and opaque counterfactual explanation methods in NLP, finding that opaque approaches often add unnecessary complexity without improving performance in tasks like fake news detection and sentiment analysis.

Contribution

It provides a comparative analysis of two main families of counterfactual explanation methods in NLP and questions the necessity of using black-box explanations for black-box models.

Findings

01 Opaque methods do not significantly outperform transparent ones.

02 Opaque approaches add complexity without performance gains.

03 Using black-box explanations for black-box models may be unnecessary.

Abstract

Although counterfactual explanations are a popular approach to explain ML black-box classifiers, they are less widespread in NLP. Most methods find those explanations by iteratively perturbing the target document until it is classified differently by the black box. We identify two main families of counterfactual explanation methods in the literature, namely, (a) transparent methods that perturb the target by adding, removing, or replacing words, and (b) opaque approaches that project the target document into a latent, non-interpretable space where the perturbation is carried out subsequently. This article offers a comparative study of the performance of these two families of methods on three classical NLP tasks. Our empirical evidence shows that opaque approaches can be an overkill for downstream applications such as fake news detection or sentiment analysis since they add…
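
The transparent family described in the abstract (perturb the words of the target document until the black box changes its prediction) can be illustrated with a small sketch. This is not the authors' implementation or any of the methods evaluated in the paper: `toy_classifier`, `transparent_counterfactual`, the word lists, and the `replacements` table are all hypothetical names introduced here for illustration, and the search is a plain breadth-first enumeration of word removals and replacements.

```python
from typing import Callable, Dict, List, Optional


def toy_classifier(tokens: List[str]) -> int:
    """Hypothetical stand-in for a black box: 1 if positive words outnumber negative ones."""
    positive = {"good", "great", "excellent"}
    negative = {"bad", "awful", "boring"}
    score = sum(t in positive for t in tokens) - sum(t in negative for t in tokens)
    return 1 if score > 0 else 0


def transparent_counterfactual(
    tokens: List[str],
    classify: Callable[[List[str]], int],
    replacements: Dict[str, List[str]],
    max_edits: int = 3,
) -> Optional[List[str]]:
    """Breadth-first search over word removals and replacements.

    Returns the first perturbed token list whose predicted class differs
    from the original prediction, or None if none is found within max_edits.
    """
    original_label = classify(tokens)
    frontier = [list(tokens)]
    for _ in range(max_edits):
        next_frontier = []
        for candidate in frontier:
            for i, word in enumerate(candidate):
                # Option 1: remove the word at position i.
                options = [candidate[:i] + candidate[i + 1:]]
                # Option 2: replace it with each substitute from the (assumed) table.
                options += [
                    candidate[:i] + [sub] + candidate[i + 1:]
                    for sub in replacements.get(word, [])
                ]
                for perturbed in options:
                    if classify(perturbed) != original_label:
                        return perturbed
                    next_frontier.append(perturbed)
        frontier = next_frontier
    return None


document = "the movie was great".split()
counterfactual = transparent_counterfactual(
    document, toy_classifier, replacements={"great": ["boring"]}
)
```

Every intermediate candidate here is a readable list of words, which is what makes the perturbation transparent; an opaque method would instead perturb a latent vector and decode it back to text. The frontier grows quickly with document length, so a practical method would need a smarter search strategy than this exhaustive one.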

Code & Models

Repositories

j2launay/ebbwbb
PyTorch · Official

Taxonomy

Topics: Benford’s Law and Fraud Detection · Financial Markets and Investment Strategies · Computability, Logic, AI Algorithms