Constructing A Multi-hop QA Dataset for Comprehensive Evaluation of Reasoning Steps
Xanh Ho, Anh-Khoa Duong Nguyen, Saku Sugawara, Akiko Aizawa

TL;DR
This paper introduces 2WikiMultiHopQA, a multi-hop QA dataset built from structured and unstructured data. Each question comes with an explicit reasoning path, which is used to evaluate and improve models' multi-step reasoning abilities.
Contribution
The paper presents a dataset that combines structured and unstructured data and attaches an explicit reasoning path to each question, together with a template-based generation pipeline designed to ensure that questions are high quality and genuinely require multi-hop reasoning.
Findings
The dataset is challenging for existing multi-hop models.
It provides explicit reasoning paths for better interpretability.
Questions are carefully designed to require multiple reasoning steps.
Abstract
A multi-hop question answering (QA) dataset aims to test reasoning and inference skills by requiring a model to read multiple paragraphs to answer a given question. However, current datasets do not provide a complete explanation for the reasoning process from the question to the answer. Further, previous studies revealed that many examples in existing multi-hop datasets do not require multi-hop reasoning to answer a question. In this study, we present a new multi-hop QA dataset, called 2WikiMultiHopQA, which uses structured and unstructured data. In our dataset, we introduce the evidence information containing a reasoning path for multi-hop questions. The evidence information has two benefits: (i) providing a comprehensive explanation for predictions and (ii) evaluating the reasoning skills of a model. We carefully design a pipeline and a set of templates when generating a…
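As a rough illustration of the "evidence information" described in the abstract, the sketch below models a reasoning path as an ordered list of (subject, relation, object) hops and checks that the hops chain from the question entity to the answer. The class names, field names, and the placeholder entities ("Film A", "Person B", "Person C") are assumptions made for this example; they are not taken from the paper or from the released dataset files.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Hop:
    """One reasoning step; hypothetical triple layout assumed for this sketch."""
    subject: str
    relation: str
    obj: str


@dataclass
class Example:
    """A question with its answer and an ordered reasoning path (assumed fields)."""
    question: str
    answer: str
    evidence: List[Hop]


def is_chained(path: List[Hop]) -> bool:
    """True if each hop's object is the subject of the following hop."""
    return all(a.obj == b.subject for a, b in zip(path, path[1:]))


def path_supports_answer(ex: Example) -> bool:
    """True if the path is non-empty, chained, and ends at the answer."""
    return bool(ex.evidence) and is_chained(ex.evidence) and ex.evidence[-1].obj == ex.answer


# Placeholder entities only; this is not a record from the dataset.
example = Example(
    question="Who is the mother of the director of Film A?",
    answer="Person C",
    evidence=[
        Hop("Film A", "director", "Person B"),
        Hop("Person B", "mother", "Person C"),
    ],
)

print(path_supports_answer(example))  # True
```

A check of this kind only covers questions whose hops form a single chain; other question types, such as comparisons between two entities, would need a different consistency rule.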
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
