Comparing Feature Importance and Rule Extraction for Interpretability on Text Data

Gianluigi Lopardo; Damien Garreau

arXiv:2207.01420 · cs.LG · October 22, 2025

TL;DR

This paper compares feature importance and rule extraction interpretability methods for text data, revealing that different methods can produce significantly different explanations even for simple models, and introduces a new way to compare these explanations.

Contribution

It introduces a novel approach to quantitatively compare explanations from different interpretability methods applied to text data models.

Findings

1. Different interpretability methods can produce contrasting explanations.

2. The proposed comparison approach quantifies differences between explanation methods.

3. Explanations can vary significantly even for simple models.

Abstract

Complex machine learning algorithms are used more and more often in critical tasks involving text data, leading to the development of interpretability methods. Among local methods, two families have emerged: those computing importance scores for each feature and those extracting simple logical rules. In this paper we show that using different methods can lead to unexpectedly different explanations, even when applied to simple models for which we would expect qualitative coincidence. To quantify this effect, we propose a new approach to compare explanations produced by different methods.
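To make the idea of comparing the two families concrete, here is a minimal sketch of one possible comparison. It is an illustration under stated assumptions, not necessarily the measure proposed in the paper. It assumes the importance-based explanation is a word-to-score mapping, the rule-based explanation is a set of words, and that agreement can be measured as the Jaccard similarity between the rule's words and the equally many top-scored words. The function names (`top_k_words`, `jaccard`, `compare_explanations`) and the example scores are hypothetical.

```python
from typing import Dict, Set


def top_k_words(importances: Dict[str, float], k: int) -> Set[str]:
    """Return the k words with the highest importance scores.

    Assumption: higher (signed) scores mean stronger support for the
    predicted class, so words are ranked by score in descending order.
    """
    ranked = sorted(importances, key=lambda w: importances[w], reverse=True)
    return set(ranked[:k])


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity |a & b| / |a | b|; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def compare_explanations(importances: Dict[str, float], rule_words: Set[str]) -> float:
    """Compare a rule (set of words) with the equally many top-scored words."""
    k = len(rule_words)
    return jaccard(top_k_words(importances, k), set(rule_words))


if __name__ == "__main__":
    # Hypothetical scores for the words of one document (illustrative values only).
    scores = {"great": 0.6, "movie": 0.05, "not": -0.4, "boring": -0.1}
    print(compare_explanations(scores, {"great", "movie"}))
    print(compare_explanations(scores, {"great", "not"}))
```

A value of 1 means the rule contains exactly the top-ranked words, and values near 0 indicate the kind of disagreement the abstract describes. Ranking by signed score rather than absolute value is a design choice made here for illustration.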

Code & Models

Repositories

gianluigilopardo/anchors_vs_lime_text (Official)

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Topic Modeling · Machine Learning and Data Classification