TL;DR
This paper introduces three datasets of annotated explanations for multihop QA, shows that BERT-based models trained on them produce better explanations, and proposes generalized reasoning chains for robustness.
Contribution
The authors present three novel explanation datasets for multihop QA and introduce a delexicalized chain representation to enhance explanation robustness and generalization.
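To make the idea of a delexicalized chain concrete, here is a minimal Python sketch. It is not the authors' implementation: it assumes that delexicalizing means replacing terms shared by two or more facts with variables, and it approximates phrases by single content words. The stopword list, variable names, and the example chain are all hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Hypothetical stopword list (an assumption); a real system would more
# likely detect repeated noun phrases with a chunker or parser.
STOPWORDS = {"a", "an", "the", "is", "are", "has", "have", "of", "to", "in", "and"}
VARIABLES = ["X", "Y", "Z", "W", "V", "U"]


def tokenize(sentence):
    """Lowercase the sentence and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def delexicalize(sentences):
    """Replace content words that occur in at least two sentences with variables.

    Returns the rewritten sentences and the word-to-variable mapping.
    """
    token_lists = [tokenize(s) for s in sentences]

    # Count the number of sentences in which each content word appears.
    sentence_freq = Counter()
    for tokens in token_lists:
        for word in set(tokens):
            if word not in STOPWORDS:
                sentence_freq[word] += 1

    # Assign variables in order of first appearance across the chain.
    mapping = {}
    for tokens in token_lists:
        for word in tokens:
            if sentence_freq[word] >= 2 and word not in mapping:
                if len(mapping) == len(VARIABLES):
                    raise ValueError("more repeated terms than available variables")
                mapping[word] = VARIABLES[len(mapping)]

    rewritten = [" ".join(mapping.get(w, w) for w in tokens) for tokens in token_lists]
    return rewritten, mapping


# Hypothetical two-fact chain followed by its conclusion.
chain = ["A squirrel is a rodent", "A rodent has fur", "A squirrel has fur"]
generalized, mapping = delexicalize(chain)
print(generalized)
```

Because the rewritten chain no longer depends on the specific words "squirrel", "rodent", or "fur", a model that scores it sees only the structure of the reasoning, which is one plausible reason such chains could be less sensitive to surface-level perturbations.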
Findings
Training on the explanation annotations improves explanation quality by 14% F1
Delexicalized chains are more robust to perturbations
Datasets enable better understanding of reasoning in QA
Abstract
Despite the rapid progress in multihop question-answering (QA), models still have trouble explaining why an answer is correct, with limited explanation training data available to learn from. To address this, we introduce three explanation datasets in which explanations formed from corpus facts are annotated. Our first dataset, eQASC, contains over 98K explanation annotations for the multihop question answering dataset QASC, and is the first that annotates multiple candidate explanations for each answer. The second dataset, eQASC-perturbed, is constructed by crowd-sourcing perturbations (while preserving their validity) of a subset of explanations in QASC, to test consistency and generalization of explanation prediction models. The third dataset, eOBQA, is constructed by adding explanation annotations to the OBQA dataset to test generalization of models trained on eQASC. We show that this…
