FiD-Ex: Improving Sequence-to-Sequence Models for Extractive Rationale Generation

Kushal Lakhotia; Bhargavi Paranjape; Asish Ghoshal; Wen-tau Yih; Yashar Mehdad; Srinivasan Iyer

arXiv:2012.15482·cs.CL·January 1, 2021

TL;DR

FiD-Ex enhances sequence-to-sequence models for NLP explanation tasks by promoting extractive generation, managing long inputs, and improving few-shot learning, leading to better explanation quality and accuracy.

Contribution

Introduces FiD-Ex, a novel approach combining sentence markers, fusion-in-decoder architecture, and intermediate fine-tuning to improve extractive explanations in seq2seq models.
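The sentence-marker idea can be illustrated with a minimal sketch. It assumes a marker format of `S1:`, `S2:`, and so on, and the helper names `add_sentence_markers` and `decode_rationale` are hypothetical; the paper's actual marker tokens and decoding procedure may differ. The point is that the model emits marker indices rather than free text, so the explanation can only consist of sentences that exist in the input.

```python
import re

def add_sentence_markers(sentences):
    """Prefix each sentence with an index marker such as 'S1:' (assumed format)."""
    return " ".join(f"S{i}: {s}" for i, s in enumerate(sentences, start=1))

def decode_rationale(generated, sentences):
    """Map markers found in generated text back to source sentences.

    Markers that point outside the input are dropped, and repeats are
    collapsed, so the result is always a subset of the input sentences.
    """
    kept = []
    for match in re.findall(r"\bS(\d+)\b", generated):
        index = int(match)
        if 1 <= index <= len(sentences) and index not in kept:
            kept.append(index)
    return [sentences[i - 1] for i in kept]

sentences = [
    "Paris is in France.",
    "It rains often.",
    "The Seine runs through Paris.",
]
marked_input = add_sentence_markers(sentences)
# Hypothetical model output: a label followed by marker tokens.
rationale = decode_rationale("True S1 S3 S9", sentences)
```

Here `S9` does not correspond to any input sentence and is discarded, which is how an extractive output format rules out fabricated explanation text.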

Findings

01. Outperforms prior models on ERASER benchmark
02. Improves explanation metrics and task accuracy
03. Effective in both fully supervised and few-shot settings

Abstract

Natural language (NL) explanations of model predictions are gaining popularity as a means to understand and verify decisions made by large black-box pre-trained models, for NLP tasks such as Question Answering (QA) and Fact Verification. Recently, pre-trained sequence to sequence (seq2seq) models have proven to be very effective in jointly making predictions, as well as generating NL explanations. However, these models have many shortcomings; they can fabricate explanations even for incorrect predictions, they are difficult to adapt to long input documents, and their training requires a large amount of labeled data. In this paper, we develop FiD-Ex, which addresses these shortcomings for seq2seq models by: 1) introducing sentence markers to eliminate explanation fabrication by encouraging extractive generation, 2) using the fusion-in-decoder architecture to handle long input contexts,…
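As a rough illustration of how a fusion-in-decoder setup handles long inputs, the sketch below splits a marked document into passages and pairs each one with the query. In that architecture each passage is encoded independently and the decoder attends over all encoded passages jointly; only the input preparation is shown here. The function name `build_passages`, the `query:` and `context:` prefixes, and the word-count budget (a stand-in for a real tokenizer limit) are assumptions, not details taken from the paper.

```python
def build_passages(query, marked_sentences, max_words=50):
    """Group sentences into passages under a word budget, each paired with the query.

    Sentences are never split, so every sentence marker stays attached
    to its full sentence within a single passage.
    """
    groups, current, count = [], [], 0
    for sentence in marked_sentences:
        n_words = len(sentence.split())
        # Start a new passage when adding this sentence would exceed the budget.
        if current and count + n_words > max_words:
            groups.append(current)
            current, count = [], 0
        current.append(sentence)
        count += n_words
    if current:
        groups.append(current)
    return [f"query: {query} context: {' '.join(g)}" for g in groups]

passages = build_passages(
    "q",
    ["S1: a b c", "S2: d e f", "S3: g h"],
    max_words=8,
)
```

Because each passage is short, the encoder cost grows with the number of passages rather than with the square of the full document length, which is the usual motivation for this kind of chunking.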

Taxonomy

Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Machine Learning in Materials Science

Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory · Sequence to Sequence