Explaining Classes through Word Attribution

Samuel Rönnqvist; Amanda Myntti; Aki-Juhani Kyröläinen; Sampo Pyysalo; Veronika Laippala; Filip Ginter

arXiv:2108.13653 · cs.CL · September 1, 2021

TL;DR

This paper introduces a method to explain how deep learning models perceive classes in text classification by aggregating individual prediction explanations, effectively identifying key class-specific keywords.

Contribution

The study presents a novel approach combining Integrated Gradients with aggregation to generate class-level explanations in text classification tasks.
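
As a rough illustration of the attribution step, the following minimal sketch computes Integrated Gradients for a toy logistic model with NumPy. The model, its weights, the all-zeros baseline, and the function names are assumptions made for this example only; they are not the paper's XLM-R setup or its implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def model(x, w, b):
    # Toy stand-in for a classifier: probability of one class.
    return sigmoid(x @ w + b)

def model_grad(x, w, b):
    # Analytic gradient of the toy model output with respect to the input.
    p = model(x, w, b)
    return p * (1.0 - p) * w

def integrated_gradients(x, baseline, w, b, steps=200):
    # Midpoint Riemann approximation of the path integral of gradients
    # along the straight line from the baseline to the input.
    alphas = (np.arange(steps) + 0.5) / steps
    grads = np.stack(
        [model_grad(baseline + a * (x - baseline), w, b) for a in alphas]
    )
    return (x - baseline) * grads.mean(axis=0)

# Hypothetical weights and input, chosen only for illustration.
w = np.array([1.5, -2.0, 0.5])
b = 0.1
x = np.array([1.0, 0.5, 2.0])
baseline = np.zeros_like(x)

attributions = integrated_gradients(x, baseline, w, b)
```

A useful sanity check is the completeness property: the attributions should sum approximately to the difference between the model output at the input and at the baseline.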

Findings

01

The method identifies plausible and discriminative keywords for all but the smallest class.

02

It works well on Web register classification with the XLM-R model.

03

Small classes may have less discriminative keyword explanations.

Abstract

In recent years, several methods have been proposed for explaining individual predictions of deep learning models, yet there has been little study of how to aggregate these predictions to explain how such models view classes as a whole in text classification tasks. In this work, we propose a method for explaining classes using deep learning models and the Integrated Gradients feature attribution technique by aggregating explanations of individual examples in text classification to general descriptions of the classes. We demonstrate the approach on Web register (genre) classification using the XLM-R model and the Corpus of Online Registers of English (CORE), finding that the method identifies plausible and discriminative keywords characterizing all but the smallest class.
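
The aggregation idea in the abstract can be sketched as follows. This is one simple scheme, assuming mean aggregation of per-document word scores with a minimum document-frequency filter; the abstract does not specify the aggregation details, so the paper's actual procedure may differ. The function name `class_keywords`, its parameters, the class labels, and the attribution values are all hypothetical.

```python
from collections import defaultdict

def class_keywords(explanations, min_docs=2, top_k=2):
    """Aggregate per-document word attributions into class-level keywords.

    explanations: list of (class_label, [(token, attribution), ...]),
    one entry per explained document.
    """
    per_class = defaultdict(lambda: defaultdict(list))
    for label, token_scores in explanations:
        # Keep one score per word per document (its maximum attribution).
        doc_best = {}
        for token, score in token_scores:
            token = token.lower()
            doc_best[token] = max(score, doc_best.get(token, float("-inf")))
        for token, score in doc_best.items():
            per_class[label][token].append(score)

    keywords = {}
    for label, tokens in per_class.items():
        # Drop words seen in too few documents, then rank by mean score.
        ranked = sorted(
            ((sum(s) / len(s), tok) for tok, s in tokens.items()
             if len(s) >= min_docs),
            reverse=True,
        )
        keywords[label] = [tok for _, tok in ranked[:top_k]]
    return keywords

# Made-up attributions for two hypothetical classes.
explanations = [
    ("recipe", [("Bake", 0.9), ("the", 0.01), ("oven", 0.7)]),
    ("recipe", [("bake", 0.8), ("the", 0.02), ("flour", 0.6)]),
    ("news", [("the", 0.03), ("minister", 0.8), ("said", 0.5)]),
    ("news", [("minister", 0.7), ("said", 0.6), ("the", 0.01)]),
]
keywords = class_keywords(explanations)
```

The document-frequency filter is the design choice worth noting: a word that scores highly in a single document says little about the class as a whole, and with very few documents per class the filter leaves little to rank, which is consistent with the weaker explanations reported for the smallest class.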

Taxonomy

Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Natural Language Processing Techniques