Hierarchical Interpretation of Neural Text Classification
Hanqi Yan, Lin Gui, Yulan He

TL;DR
This paper introduces Hint, a hierarchical neural text classifier that explains predictions through label-associated topics, providing more faithful and human-understandable interpretations while maintaining competitive classification performance.
Contribution
The paper proposes a hierarchical interpretability method that explains neural text classification through topics rather than individual words, with the aim of improving the faithfulness and interpretability of the explanations.
Findings
Hint achieves classification accuracy comparable to state-of-the-art models.
Its topic-level explanations are more faithful to the model's predictions and easier for humans to understand.
The approach is effective on both review and news datasets.
Abstract
Recent years have witnessed increasing interest in developing interpretable models in Natural Language Processing (NLP). Most existing models aim at identifying input features, such as words or phrases, that are important for model predictions. Neural models developed in NLP, however, often compose word semantics in a hierarchical manner, and text classification requires hierarchical modelling to aggregate local information in order to deal with topic and label shifts more effectively. As such, interpretation by words or phrases alone cannot faithfully explain model decisions in text classification. This paper proposes a novel Hierarchical INTerpretable neural text classifier, called Hint, which can automatically generate explanations of model predictions in the form of label-associated topics in a hierarchical manner. Model interpretation is no longer at the word level, but built on topics as the…
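To make the contrast between word-level and topic-level interpretation concrete, the following is a minimal sketch, not Hint's actual architecture. It assumes we already have per-word importance scores (for example, from an attention layer) and a word-to-topic assignment; both inputs, along with the function names `topic_importance` and `explain`, are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def topic_importance(word_scores, word_topic):
    """Aggregate per-word importance into normalized per-topic importance.

    word_scores: dict mapping word -> importance score (hypothetical input).
    word_topic:  dict mapping word -> topic id (hypothetical assignment).
    Words without a topic assignment are ignored.
    """
    totals = defaultdict(float)
    for word, score in word_scores.items():
        topic = word_topic.get(word)
        if topic is None:
            continue  # e.g. stop words that belong to no topic
        totals[topic] += score
    norm = sum(totals.values())
    if norm == 0:
        return {}
    return {topic: total / norm for topic, total in totals.items()}


def explain(word_scores, word_topic, topic_names, top_k=2):
    """Return the top_k topics by aggregated importance, most important first."""
    scores = topic_importance(word_scores, word_topic)
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [(topic_names[topic], score) for topic, score in ranked[:top_k]]


# Toy product review: the scores and topic assignments are made up.
word_scores = {"battery": 0.5, "charge": 0.25, "screen": 0.125,
               "bright": 0.125, "the": 0.5}
word_topic = {"battery": 0, "charge": 0, "screen": 1, "bright": 1}
topic_names = {0: "battery life", 1: "display"}

print(explain(word_scores, word_topic, topic_names))
# [('battery life', 0.75), ('display', 0.25)]
```

Instead of reporting "battery" and "charge" as separate important words, the explanation names the topic they share. The paper's model learns label-associated topics jointly with the classifier; here the grouping is supplied by hand purely to show the change in the unit of explanation.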
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Topic Modeling
