Learning with Interpretable Structure from Gated RNN

Bo-Jian Hou; Zhi-Hua Zhou

arXiv:1810.10708 · cs.NE · January 15, 2020 · 6 cites

Open Access

TL;DR

This paper extracts finite state automata (FSAs) from gated RNNs to improve interpretability. It shows that a learned FSA is more trustworthy than the RNN it was learned from and can reveal how RNNs process text in classification tasks.

Contribution

The paper introduces two methods for learning an FSA from an RNN, each based on a different clustering method. The learned FSA serves as an interpretable structure and gives insight into the RNN's inner mechanism.

Findings

01. An FSA learned from an RNN is more trustworthy than the RNN it was learned from.

02. RNNs with fewer gates can still maintain performance, which can guide RNN design.

03. In text classification, the states of the learned FSA correspond to semantic concepts.

Abstract

The interpretability of deep learning models has attracted considerable attention in recent years, and it would be beneficial to learn an interpretable structure from such models. In this paper, we focus on Recurrent Neural Networks (RNNs), especially gated RNNs, whose inner mechanism is still not clearly understood. We find that a Finite State Automaton (FSA), which processes sequential data, has a more interpretable inner mechanism according to the definition of interpretability and can be learned from RNNs as the interpretable structure. We propose two methods to learn an FSA from an RNN, based on two different clustering methods. With the learned FSA, and via experiments on artificial and real datasets, we find that the FSA is more trustworthy than the RNN from which it was learned, which gives the FSA a chance to substitute for RNNs in applications involving human lives or dangerous facilities. Besides, we analyze…
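
To make the idea of learning an FSA by clustering more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' method: the abstract does not name the two clustering methods, so plain k-means with a deterministic initialisation is swapped in, and transitions are chosen by majority vote. All function names (`kmeans`, `extract_fsa`, `run_fsa`, `toy_trace`) are hypothetical, and the "hidden states" are hand-made 2-D points that encode the parity of 1s, standing in for states recorded from a trained RNN.

```python
from collections import Counter, defaultdict
from math import dist


def nearest(p, centers):
    """Index of the center closest to point p."""
    return min(range(len(centers)), key=lambda i: dist(p, centers[i]))


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means with farthest-point initialisation (an assumption)."""
    centers = [points[0]]
    while len(centers) < k:
        # Add the point that is farthest from its nearest existing center.
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    for _ in range(iters):
        groups = defaultdict(list)
        for p in points:
            groups[nearest(p, centers)].append(p)
        centers = [
            tuple(sum(xs) / len(xs) for xs in zip(*groups[i])) if groups[i] else centers[i]
            for i in range(k)
        ]
    return centers


def extract_fsa(traces, k):
    """Build an FSA from traces of (symbols, hidden_states).

    Each trace must satisfy len(hidden_states) == len(symbols) + 1,
    where hidden_states[0] is the state before any input is read.
    """
    points = [h for _, hs in traces for h in hs]
    centers = kmeans(points, k)
    votes = defaultdict(Counter)
    for symbols, hs in traces:
        states = [nearest(h, centers) for h in hs]  # cluster id = FSA state
        for src, sym, dst in zip(states, symbols, states[1:]):
            votes[(src, sym)][dst] += 1
    # Majority vote decides each (state, symbol) -> next state transition.
    transitions = {key: c.most_common(1)[0][0] for key, c in votes.items()}
    start = nearest(traces[0][1][0], centers)
    return start, transitions


def run_fsa(start, transitions, symbols):
    """Follow the extracted transitions and return the final state."""
    state = start
    for sym in symbols:
        state = transitions[(state, sym)]
    return state


def toy_trace(symbols):
    """Hand-made stand-in for RNN hidden states: noisy encoding of the parity of 1s."""
    parity, hs = 0, [(0.05, -0.05)]
    for i, sym in enumerate(symbols):
        parity ^= int(sym)
        jitter = 0.1 * ((i % 3) - 1)
        hs.append((parity + jitter, parity - jitter))
    return symbols, hs


traces = [toy_trace(s) for s in ["0110", "1011", "0001", "1100"]]
start, transitions = extract_fsa(traces, k=2)
print(start, sorted(transitions.items()))
```

On this toy data the two clusters play the role of "even" and "odd" states, so the transition table can be read directly, which is the kind of inspectable structure the abstract argues for. With a real RNN, the number of clusters `k` and the clustering method would need to be chosen and validated against the network's behaviour.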

Taxonomy

Topics: Machine Learning and Algorithms · Machine Learning in Materials Science · Topic Modeling

Methods: Interpretability