Explaining Text Similarity in Transformer Models

Alexandros Vasileiou; Oliver Eberle

arXiv:2405.06604 · cs.CL · May 13, 2024

TL;DR

This paper uses layer-wise relevance propagation (LRP) and its second-order extension BiLRP to explain Transformer-based similarity models in NLP, showing which feature interactions drive a predicted similarity.

Contribution

It introduces the use of BiLRP for second-order explanations in Transformer similarity models, enabling detailed analysis of feature interactions in NLP tasks.
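To make "second-order explanation" concrete, here is a minimal sketch under strong simplifying assumptions: a single linear feature map W and a dot-product similarity, rather than a Transformer. It is not the authors' implementation; the function name bilrp_linear and all variable names are hypothetical. In this linear case, each input's first-order contribution to a feature is W[m, i] * x[i], and pairing these contributions across the two inputs gives an interaction matrix whose entries sum exactly to the similarity score.

```python
import numpy as np


def bilrp_linear(W, x, x_prime):
    """Interaction scores for y = <W x, W x'> (illustrative linear case only)."""
    # First-order contribution of input dimension i to feature m.
    R = W * x                # shape (M, D): R[m, i] = W[m, i] * x[i]
    R_prime = W * x_prime    # same for the second input
    # Pair contributions per feature and sum over features:
    # out[i, j] = sum_m R[m, i] * R_prime[m, j]
    return R.T @ R_prime


# Toy data (assumed, for illustration only).
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))
x = rng.normal(size=3)
x_prime = rng.normal(size=3)

interactions = bilrp_linear(W, x, x_prime)
similarity = float((W @ x) @ (W @ x_prime))

# Conservation: the interaction scores add up to the similarity score.
assert np.isclose(interactions.sum(), similarity)
```

Entry (i, j) of the result reads as "how much the pairing of input dimension i in the first text with dimension j in the second contributes to the similarity". For deep models the first-order terms would come from LRP passes through the network rather than from a single weight matrix.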

Findings

1. BiLRP effectively reveals feature interactions driving similarity.
2. Explainability methods improve understanding of multilingual semantics.
3. Insights assist in biomedical text retrieval analysis.

Abstract

As Transformers have become state-of-the-art models for natural language processing (NLP) tasks, the need to understand and explain their predictions is increasingly apparent. Especially in unsupervised applications, such as information retrieval tasks, similarity models built on top of foundation model representations have been widely applied. However, their inner prediction mechanisms have mostly remained opaque. Recent advances in explainable AI have made it possible to mitigate these limitations by leveraging improved explanations for Transformers through layer-wise relevance propagation (LRP). Using BiLRP, an extension developed for computing second-order explanations in bilinear similarity models, we investigate which feature interactions drive similarity in NLP models. We validate the resulting explanations and demonstrate their utility in three corpus-level use cases, analyzing…

Code & Models

Repositories

alevas/xai_similarity_transformers (PyTorch, official)

Videos

Explaining Text Similarity in Transformer Models · underline

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Digital Humanities and Scholarship