Explaining word embeddings with perfect fidelity: Case study in research impact prediction
Lucie Dvorackova, Marcin P. Joachimiak, Michal Cerny, Adriana Kubecova, Vilem Sklenak, Tomas Kliegr

TL;DR
This paper introduces SMER, a new explanation method for word embedding-based classifiers that guarantees perfect fidelity, improving interpretability of research impact prediction models.
Contribution
The paper presents SMER, a novel feature importance method with theoretically perfect fidelity for logistic regression models trained on word embeddings.
Findings
SMER outperforms LIME, SHAP, and global surrogates in explanation quality.
SMER provides exact correspondence between feature importance and model logits.
Evaluation on 50,000 research articles demonstrates SMER’s effectiveness.
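The exact correspondence between per-word importance and model logits can be illustrated with a small sketch. It assumes, as this summary does not state, that a document is represented by the mean of its word embeddings; under that assumption a linear model's logit for the document equals the mean of the logits it assigns to the individual words. The vocabulary, embeddings, weights, and bias below are all hypothetical toy values, and this is not the authors' implementation.

```python
import random

random.seed(0)
DIM = 4  # hypothetical embedding dimensionality

# Hypothetical toy vocabulary with random embeddings (stand-ins for a trained model).
vocab = ["virus", "protein", "trial", "model", "dose"]
embeddings = {w: [random.gauss(0, 1) for _ in range(DIM)] for w in vocab}

# Hypothetical logistic regression parameters.
weights = [random.gauss(0, 1) for _ in range(DIM)]
bias = 0.3


def logit(vec):
    """Linear part of logistic regression: w . x + b."""
    return sum(wi * xi for wi, xi in zip(weights, vec)) + bias


def document_vector(words):
    """Assumption: a document is the mean of its word embeddings."""
    return [sum(embeddings[w][d] for w in words) / len(words) for d in range(DIM)]


doc = ["virus", "trial", "dose"]

# Score each word by feeding its own embedding to the classifier.
word_logits = [logit(embeddings[w]) for w in doc]
mean_word_logit = sum(word_logits) / len(word_logits)

# Score the whole document through the same classifier.
doc_logit = logit(document_vector(doc))

print(mean_word_logit, doc_logit)
```

The two printed values agree up to floating-point rounding because the logit is linear in the input: averaging before or after applying the weights and bias gives the same result. The equality would not hold for the predicted probabilities, since the sigmoid is nonlinear.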
Abstract
The best-performing approaches for scholarly document quality prediction are based on embedding models. In addition to their performance when used in classifiers, embedding models can also provide predictions even for words that were not contained in the labelled training data for the classification model, which is important in the context of the ever-evolving research terminology. Although model-agnostic explanation methods, such as Local interpretable model-agnostic explanations, can be applied to explain machine learning classifiers trained on embedding models, these produce results with questionable correspondence to the model. We introduce a new feature importance method, Self-model Rated Entities (SMER), for logistic regression-based classification models trained on word embeddings. We show that SMER has theoretically perfect fidelity with the explained model, as the average of…
Taxonomy
Topics: Topic Modeling
Methods: Local Interpretable Model-Agnostic Explanations
