TL;DR
This paper introduces NILE, a natural language inference system that predicts labels with high accuracy and generates faithful, label-specific explanations. It also argues that the faithfulness of model explanations should be evaluated explicitly.
Contribution
NILE is the first method to produce high-accuracy NLI labels with faithful, label-specific natural language explanations, validated through automated and human evaluations.
Findings
NILE outperforms previous methods in label and explanation accuracy.
NILE's explanations are shown to be faithful through sensitivity analysis.
Explicit faithfulness evaluation is crucial for trustworthy explanations.
Abstract
The recent growth in the popularity and success of deep learning models on NLP classification tasks has been accompanied by the need to generate some form of natural language explanation of the predicted labels. Such generated natural language (NL) explanations are expected to be faithful, i.e., they should correlate well with the model's internal decision making. In this work, we focus on the task of natural language inference (NLI) and address the following question: can we build NLI systems which produce labels with high accuracy, while also generating faithful explanations of their decisions? We propose Natural-language Inference over Label-specific Explanations (NILE), a novel NLI method which utilizes auto-generated label-specific NL explanations to produce labels along with faithful explanations. We demonstrate NILE's effectiveness over previously reported methods through automated…
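The idea of predicting a label from label-specific explanations can be illustrated with a minimal sketch. This is not the authors' implementation: the three-label set, the per-label generator functions, the scorer that sees only the explanation text, and the softmax over scores are all assumptions made for illustration, and the toy generators and scorer below are hypothetical stand-ins for trained models.

```python
import math
from typing import Callable, Dict, Tuple

# Assumed label set: the standard three NLI classes.
LABELS = ("entailment", "contradiction", "neutral")

Generator = Callable[[str, str], str]   # (premise, hypothesis) -> explanation
Scorer = Callable[[str], float]         # explanation -> unnormalized score


def softmax(scores: Dict[str, float]) -> Dict[str, float]:
    """Normalize raw scores into a probability distribution."""
    top = max(scores.values())
    exps = {k: math.exp(v - top) for k, v in scores.items()}
    total = sum(exps.values())
    return {k: v / total for k, v in exps.items()}


def predict(
    premise: str,
    hypothesis: str,
    generators: Dict[str, Generator],
    scorer: Scorer,
) -> Tuple[str, str, Dict[str, float]]:
    """Generate one explanation per label, score each, return the best label
    together with the explanation that was generated for that label."""
    explanations = {lab: generators[lab](premise, hypothesis) for lab in LABELS}
    probs = softmax({lab: scorer(explanations[lab]) for lab in LABELS})
    best = max(probs, key=probs.get)
    return best, explanations[best], probs


# Hypothetical toy components, standing in for trained models.
TOY_GENERATORS: Dict[str, Generator] = {
    "entailment": lambda p, h: f"'{p}' implies '{h}'.",
    "contradiction": lambda p, h: f"'{p}' rules out '{h}'.",
    "neutral": lambda p, h: f"'{p}' says nothing about '{h}'.",
}


def toy_scorer(explanation: str) -> float:
    # Placeholder heuristic only; a real scorer would be a learned model.
    return 1.0 if "implies" in explanation else 0.0


if __name__ == "__main__":
    label, explanation, probs = predict(
        "A man is sleeping", "A person is asleep", TOY_GENERATORS, toy_scorer
    )
    print(label, explanation, probs)
```

Because the scorer in this sketch receives only the generated explanations, changing an explanation can change the predicted label, which is the kind of dependence a faithfulness check would probe.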
