A Sea of Words: An In-Depth Analysis of Anchors for Text Data
Gianluigi Lopardo, Frederic Precioso, Damien Garreau

TL;DR
This paper provides a theoretical analysis of the Anchors interpretability method for text classification, examining its behavior across different models and offering insights into its selection mechanism, especially for neural networks.
Contribution
The paper offers the first formal analysis of Anchors for text data, with explicit results for models that use TF-IDF vectorization and insights into how Anchors explains neural networks.
Findings
Anchors identifies influential words in text classification.
The theoretical results cover elementary if-then rules and linear classifiers applied to TF-IDF features.
For neural networks, empirical evidence shows that Anchors selects words with high partial derivatives.
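The last finding can be illustrated with a minimal sketch that ranks words by the partial derivative of a model's output with respect to each input feature. Everything here is an assumption made for illustration: the logistic `model` is a toy stand-in for a neural network, and its weights, the word list, and the TF-IDF values are hypothetical. Derivatives are approximated by central finite differences.

```python
import math

def model(x, w=(0.2, -1.5, 2.0, 0.7), b=-0.3):
    """Toy differentiable classifier (hypothetical weights): sigmoid(w.x + b)."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

def partial_derivatives(f, x, h=1e-5):
    """Central finite-difference estimate of df/dx_j for every feature j."""
    grads = []
    for j in range(len(x)):
        up = list(x)
        up[j] += h
        down = list(x)
        down[j] -= h
        grads.append((f(up) - f(down)) / (2 * h))
    return grads

# Hypothetical document: one TF-IDF value per word.
words = ["the", "boring", "great", "plot"]
tfidf = [0.1, 0.4, 0.6, 0.3]

grads = partial_derivatives(model, tfidf)
# Words sorted from highest to lowest partial derivative.
ranking = [word for _, word in sorted(zip(grads, words), reverse=True)]
print(ranking)
```

For a real network the same ranking would normally come from automatic differentiation rather than finite differences; the sketch only shows which quantity is being compared across words.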
Abstract
Anchors (Ribeiro et al., 2018) is a post-hoc, rule-based interpretability method. For text data, it proposes to explain a decision by highlighting a small set of words (an anchor) such that the model to explain has similar outputs when they are present in a document. In this paper, we present the first theoretical analysis of Anchors, considering that the search for the best anchor is exhaustive. After formalizing the algorithm for text classification, we present explicit results on different classes of models when the vectorization step is TF-IDF, and words are replaced by a fixed out-of-dictionary token when removed. Our inquiry covers models such as elementary if-then rules and linear classifiers. We then leverage this analysis to gain insights on the behavior of Anchors for any differentiable classifiers. For neural networks, we empirically show that the words corresponding to the…
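The setting described in the abstract (exhaustive search, removed words replaced by a fixed out-of-dictionary token) can be sketched on a toy example. This is a simplified illustration under stated assumptions, not the authors' implementation: the `classify` rule, the `UNK` token string, the 0.95 precision threshold, the choice to replace each non-anchor word independently with probability 1/2, and the rule "return the shortest anchor that reaches the threshold" are all assumptions. Precision is computed exactly by enumerating every perturbation, which is only feasible for very short documents.

```python
from itertools import combinations, product

UNK = "UNK"  # assumed out-of-dictionary replacement token

def classify(words):
    """Toy if-then rule (hypothetical): predict 1 iff 'not' and 'bad' both appear."""
    return int("not" in words and "bad" in words)

def precision(model, words, anchor):
    """Fraction of perturbed documents that keep the original prediction.

    `anchor` is a tuple of word positions that are always kept. Every other
    position is either kept or replaced by UNK; all combinations are
    enumerated, which matches independent replacement with probability 1/2.
    """
    target = model(words)
    free = [i for i in range(len(words)) if i not in anchor]
    agree = 0
    for mask in product([False, True], repeat=len(free)):
        sample = list(words)
        for pos, drop in zip(free, mask):
            if drop:
                sample[pos] = UNK
        agree += model(sample) == target
    return agree / 2 ** len(free)

def exhaustive_anchor(model, words, threshold=0.95):
    """Return the shortest anchor whose precision reaches `threshold`."""
    n = len(words)
    for size in range(1, n + 1):
        scored = [(precision(model, words, a), a)
                  for a in combinations(range(n), size)]
        best_prec, best = max(scored, key=lambda t: t[0])
        if best_prec >= threshold:
            return [words[i] for i in best], best_prec
    return list(words), 1.0  # only reached for an empty document

doc = "the movie is not bad".split()
anchor, prec = exhaustive_anchor(classify, doc)
print(anchor, prec)
```

On this toy rule, no single word reaches the threshold, because the prediction flips whenever the other required word is replaced; the pair of words the rule depends on is the first candidate with precision 1.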
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Topic Modeling · Computational and Text Analysis Methods
