Towards Interpretable Natural Language Understanding with Explanations as Latent Variables
Wangchunshu Zhou, Jinyi Hu, Hanlin Zhang, Xiaodan Liang, Maosong Sun, Chenyan Xiong, Jian Tang

TL;DR
This paper introduces a framework for interpretable natural language understanding that treats explanations as latent variables, so the model can make effective predictions when trained on only a small set of annotated explanations, including in semi-supervised settings.
Contribution
It proposes a variational EM-based approach treating explanations as latent variables, reducing the need for extensive annotated data and enhancing interpretability and performance.
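To make the latent-variable idea concrete, here is a minimal sketch of the generic evidence lower bound (ELBO) that variational EM methods alternate on. It is not the paper's objective or implementation: the three candidate explanations, the probabilities in `prior` and `lik`, and the function names are all hypothetical and chosen for illustration only. The sketch shows that the bound on log p(y|x) becomes tight when the variational distribution q(e) equals the exact posterior over explanations.

```python
import math

def log_marginal(prior, lik):
    # log p(y|x) = log sum_e p(e|x) * p(y|x,e), summing over discrete explanations e
    return math.log(sum(p * l for p, l in zip(prior, lik)))

def elbo(q, prior, lik):
    # E_q[log p(y|x,e)] - KL(q(e) || p(e|x)), written as one sum over e
    return sum(
        qe * (math.log(l) + math.log(p) - math.log(qe))
        for qe, p, l in zip(q, prior, lik)
        if qe > 0
    )

def exact_posterior(prior, lik):
    # p(e|x,y) is proportional to p(e|x) * p(y|x,e); tractable only in this toy case
    z = sum(p * l for p, l in zip(prior, lik))
    return [p * l / z for p, l in zip(prior, lik)]

# Hypothetical toy values (assumptions, not taken from the paper):
prior = [0.5, 0.3, 0.2]  # p(e|x): explanation generator's distribution
lik = [0.9, 0.2, 0.1]    # p(y|x,e): predictor's probability of the true label

loose = elbo(prior, prior, lik)                        # q set to the prior
tight = elbo(exact_posterior(prior, lik), prior, lik)  # q set to the posterior
print(loose, tight, log_marginal(prior, lik))
```

In this toy setting the posterior can be enumerated exactly. With free-form natural language explanations it cannot, which is why a variational approximation and alternating updates of the two modules are used instead.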
Findings
Effective in supervised and semi-supervised settings
Generates high-quality natural language explanations
Improves prediction accuracy with limited annotations
Abstract
Recently, generating natural language explanations has shown very promising results in not only offering interpretable explanations but also providing additional information and supervision for prediction. However, existing approaches usually require a large set of human-annotated explanations for training, and collecting a large set of explanations is both time-consuming and expensive. In this paper, we develop a general framework for interpretable natural language understanding that requires only a small set of human-annotated explanations for training. Our framework treats natural language explanations as latent variables that model the underlying reasoning process of a neural model. We develop a variational EM framework for optimization where an explanation generation module and an explanation-augmented prediction module are alternately optimized and mutually enhance each…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Explainable Artificial Intelligence (XAI)
