Constructing A Multi-hop QA Dataset for Comprehensive Evaluation of Reasoning Steps

Xanh Ho; Anh-Khoa Duong Nguyen; Saku Sugawara; Akiko Aizawa

arXiv:2011.01060 · cs.CL · November 13, 2020 · 5 cites

TL;DR

This paper introduces 2WikiMultiHopQA, a new multi-hop QA dataset with explicit reasoning paths, designed to evaluate and improve models' multi-step reasoning abilities using structured and unstructured data.

Contribution

The paper presents a novel dataset with explicit reasoning paths, combining structured and unstructured data, and a pipeline ensuring high-quality, multi-hop questions for comprehensive evaluation.

Findings

01. The dataset is challenging for existing multi-hop models.
02. It provides explicit reasoning paths for better interpretability.
03. Questions are carefully designed to require multiple reasoning steps.

Abstract

A multi-hop question answering (QA) dataset aims to test reasoning and inference skills by requiring a model to read multiple paragraphs to answer a given question. However, current datasets do not provide a complete explanation for the reasoning process from the question to the answer. Further, previous studies revealed that many examples in existing multi-hop datasets do not require multi-hop reasoning to answer a question. In this study, we present a new multi-hop QA dataset, called 2WikiMultiHopQA, which uses structured and unstructured data. In our dataset, we introduce the evidence information containing a reasoning path for multi-hop questions. The evidence information has two benefits: (i) providing a comprehensive explanation for predictions and (ii) evaluating the reasoning skills of a model. We carefully design a pipeline and a set of templates when generating a…
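To make the two stated benefits of the evidence information concrete, the following is a minimal sketch of how a reasoning path might be represented and scored. It assumes each evidence item is a (subject, relation, object) triple and that a predicted path is compared with the gold path by set-level F1. The names `Triple`, `normalize`, and `evidence_f1`, and the example entities, are hypothetical illustrations and are not taken from the official repository or its evaluation script.

```python
from typing import Iterable, Tuple

# Assumed representation: one reasoning step = (subject, relation, object).
Triple = Tuple[str, str, str]


def normalize(triple: Triple) -> Triple:
    """Lowercase and strip each field so trivial surface differences do not count as errors."""
    subject, relation, obj = triple
    return (subject.strip().lower(), relation.strip().lower(), obj.strip().lower())


def evidence_f1(predicted: Iterable[Triple], gold: Iterable[Triple]) -> float:
    """Set-level F1 between predicted and gold evidence triples (illustrative metric)."""
    pred_set = {normalize(t) for t in predicted}
    gold_set = {normalize(t) for t in gold}
    # If either side is empty, score 1.0 only when both are empty.
    if not pred_set or not gold_set:
        return float(pred_set == gold_set)
    overlap = len(pred_set & gold_set)
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_set)
    recall = overlap / len(gold_set)
    return 2 * precision * recall / (precision + recall)


# Hypothetical two-hop question: "When was the director of Film A born?"
gold_path = [
    ("Film A", "director", "Person X"),
    ("Person X", "date of birth", "1 January 1950"),
]
# A model that recovers only the first hop.
predicted_path = [("film a", "director", "person x")]

score = evidence_f1(predicted_path, gold_path)
print(f"evidence F1 = {score:.3f}")
```

Under these assumptions, a model that finds the right answer but recovers only one of the two hops is penalized on the evidence score, which is how a reasoning path can serve both as an explanation of a prediction and as a check on whether the model actually performed each step.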

Code & Models

Repositories

Alab-NII/2wikimultihop (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications