Multi-Granular Text Encoding for Self-Explaining Categorization

Zhiguo Wang; Yue Zhang; Mo Yu; Wei Zhang; Lin Pan; Linfeng Song; Kun Xu; Yousef El-Kurdi

arXiv:1907.08532 · cs.CL · July 22, 2019 · 1 citation

TL;DR

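This paper introduces a multi-granular n-gram encoding method for self-explaining text categorization that organizes n-grams hierarchically and encodes them with a tree-structured LSTM. On medical disease classification, the model is more accurate, efficient, and compact than BiLSTM and CNN baselines, and it extracts multi-granular evidence to support its predictions.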

Contribution

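The paper organizes multi-granular n-grams into a hierarchy in which shorter n-grams are reused when computing longer ones, and uses a tree-structured LSTM with shared parameters to learn a context-independent representation of each n-gram, which then serves as a unit of explanation.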

Findings

1. Outperforms BiLSTM and CNN baselines in accuracy on medical disease classification

2. Is more efficient and more compact than the BiLSTM and CNN baselines

3. Extracts intuitive multi-granular evidence to support its predictions

Abstract

Self-explaining text categorization requires a classifier to make a prediction along with supporting evidence. A popular type of evidence is sub-sequences extracted from the input text which are sufficient for the classifier to make the prediction. In this work, we define multi-granular ngrams as basic units for explanation, and organize all ngrams into a hierarchical structure, so that shorter ngrams can be reused while computing longer ngrams. We leverage a tree-structured LSTM to learn a context-independent representation for each unit via parameter sharing. Experiments on medical disease classification show that our model is more accurate, efficient and compact than BiLSTM and CNN baselines. More importantly, our model can extract intuitive multi-granular evidence to support its predictions.
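
The reuse of shorter n-grams described in the abstract can be illustrated with a minimal sketch. It assumes that each n-gram is composed from the two overlapping (n-1)-grams it contains, which is one plausible reading of the hierarchy and not a detail confirmed by the abstract. The names `build_ngram_hierarchy`, `toy_leaf`, and `toy_compose` are hypothetical, and `toy_compose` is a stand-in that merges token tuples where the paper would apply a tree-structured LSTM cell with shared parameters.

```python
from typing import Callable, Dict, List, Tuple

Span = Tuple[int, int]  # (start index, n-gram length)
Rep = Tuple[str, ...]   # toy "representation": the tokens the unit covers


def build_ngram_hierarchy(
    tokens: List[str],
    max_n: int,
    leaf: Callable[[str], Rep],
    compose: Callable[[Rep, Rep], Rep],
) -> Dict[Span, Rep]:
    """Build representations for all n-grams up to length max_n.

    Assumption (not taken from the source): the n-gram starting at i is
    composed from the (n-1)-grams starting at i and i + 1, so every
    shorter unit is computed once and then reused.
    """
    reps: Dict[Span, Rep] = {}
    # Level 1: unigrams come straight from the tokens.
    for i, tok in enumerate(tokens):
        reps[(i, 1)] = leaf(tok)
    # Level n: reuse the two overlapping children from level n - 1.
    for n in range(2, max_n + 1):
        for i in range(len(tokens) - n + 1):
            left = reps[(i, n - 1)]
            right = reps[(i + 1, n - 1)]
            reps[(i, n)] = compose(left, right)
    return reps


def toy_leaf(token: str) -> Rep:
    # Hypothetical stand-in for an embedding lookup.
    return (token,)


def toy_compose(left: Rep, right: Rep) -> Rep:
    # Hypothetical stand-in for a tree-LSTM cell: the children overlap in
    # all but one token, so append only the last token of the right child.
    return left + right[-1:]


tokens = ["severe", "chest", "pain", "today"]
units = build_ngram_hierarchy(tokens, max_n=3, leaf=toy_leaf, compose=toy_compose)
```

Because the same composition function is applied at every position and level, a unit's representation depends only on its own tokens and not on the surrounding text, which matches the context-independent property named in the abstract. Each entry in `units` is a candidate unit of evidence; how the classifier scores and selects those units is not specified in the abstract and is not modeled here.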

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques

Methods: Sigmoid Activation · Tanh Activation · Bidirectional LSTM · Long Short-Term Memory