FiD-Ex: Improving Sequence-to-Sequence Models for Extractive Rationale Generation
Kushal Lakhotia, Bhargavi Paranjape, Asish Ghoshal, Wen-tau Yih, Yashar Mehdad, Srinivasan Iyer

TL;DR
FiD-Ex enhances sequence-to-sequence models for NLP explanation tasks by promoting extractive generation, managing long inputs, and improving few-shot learning, leading to better explanation quality and accuracy.
Contribution
Introduces FiD-Ex, a novel approach combining sentence markers, fusion-in-decoder architecture, and intermediate fine-tuning to improve extractive explanations in seq2seq models.
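The sentence-marker idea can be illustrated with a minimal sketch: each input sentence is prefixed with a marker token, and the model is trained to emit markers instead of free text, so the rationale is recovered by looking the markers up in the input. The marker format (`sent1:`, `sent2:`, ...), the function names, and the example output string are assumptions made for illustration, not the paper's exact implementation.

```python
import re

def add_sentence_markers(sentences):
    """Prefix each sentence with a marker token (assumed format: 'sentN:')."""
    return " ".join(f"sent{i}: {s}" for i, s in enumerate(sentences, start=1))

def decode_rationale(generated, sentences):
    """Map marker tokens in the generated text back to the source sentences.

    Markers that point outside the input are ignored, so every returned
    sentence is copied verbatim from the input.
    """
    ids = [int(m) for m in re.findall(r"\bsent(\d+)\b", generated)]
    return [sentences[i - 1] for i in ids if 1 <= i <= len(sentences)]

sentences = ["The sky is blue.", "Grass is green.", "Snow is white."]
marked_input = add_sentence_markers(sentences)
# Hypothetical model output: a label followed by the chosen markers.
rationale = decode_rationale("TRUE explanation: sent1 sent3", sentences)
```

Because the decoder only selects markers, a rationale built this way cannot contain text that is absent from the input.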
Findings
Outperforms prior models on ERASER benchmark
Improves explanation metrics and task accuracy
Effective in both fully supervised and few-shot settings
Abstract
Natural language (NL) explanations of model predictions are gaining popularity as a means to understand and verify decisions made by large black-box pre-trained models, for NLP tasks such as Question Answering (QA) and Fact Verification. Recently, pre-trained sequence to sequence (seq2seq) models have proven to be very effective in jointly making predictions, as well as generating NL explanations. However, these models have many shortcomings; they can fabricate explanations even for incorrect predictions, they are difficult to adapt to long input documents, and their training requires a large amount of labeled data. In this paper, we develop FiD-Ex, which addresses these shortcomings for seq2seq models by: 1) introducing sentence markers to eliminate explanation fabrication by encouraging extractive generation, 2) using the fusion-in-decoder architecture to handle long input contexts,…
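The fusion-in-decoder step mentioned in the abstract can be sketched as follows: a long input is split into passages, each passage is paired with the query and encoded independently, and the per-passage encoder outputs are concatenated into one sequence for the decoder to attend over. This is a simplified sketch in plain Python; the word-count budget, the `sentN:` marker format, the choice to number sentences continuously across passages, and the function names are all assumptions made for illustration.

```python
def build_passages(query, sentences, max_words=8):
    """Greedily pack marked sentences into passages, each prefixed with the query.

    max_words is an assumed budget on the marked sentences in one passage;
    a real system would count model tokens instead of whitespace words.
    """
    passages, current, count = [], [], 0
    for i, sentence in enumerate(sentences, start=1):
        marked = f"sent{i}: {sentence}"
        n = len(marked.split())
        if current and count + n > max_words:
            passages.append(f"{query} {' '.join(current)}")
            current, count = [], 0
        current.append(marked)
        count += n
    if current:
        passages.append(f"{query} {' '.join(current)}")
    return passages

def fuse(encoded_passages):
    """Concatenate per-passage encoder outputs into one sequence for the decoder."""
    return [vector for passage in encoded_passages for vector in passage]

passages = build_passages("question: q", ["a b c", "d e f", "g"])
# Stand-in for encoder outputs: one list of vectors per passage.
fused = fuse([[1, 2], [3]])
```

Since each passage is encoded on its own, encoder cost grows roughly linearly with the number of passages, while the decoder still sees all of them at once.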
Taxonomy
Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Machine Learning in Materials Science
Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory · Sequence to Sequence
