How Well Do Multi-hop Reading Comprehension Models Understand Date Information?
Xanh Ho, Saku Sugawara, and Akiko Aizawa

TL;DR
This paper introduces HieraDate, a dataset designed to evaluate and improve multi-hop reading comprehension models' understanding of date information through hierarchical probing tasks, revealing current limitations and potential for robustness enhancement.
Contribution
The paper presents HieraDate, a new dataset with hierarchical probing tasks for date reasoning, and demonstrates its utility in diagnosing and improving multi-hop QA models' date comprehension.
Findings
Models struggle with date subtraction despite good comparison performance.
Training with the probing questions improves main-task performance by over 10 F1 points.
Dataset augmentation enhances model robustness.
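The F1 figure above is most likely the token-overlap answer F1 commonly reported for HotpotQA-style QA; this page does not say so explicitly, so treat that as an assumption. A minimal sketch of the metric follows. The function name `token_f1` is hypothetical, and the sketch uses only lowercasing and whitespace tokenization, whereas official evaluation scripts typically also strip punctuation and articles.

```python
from collections import Counter


def token_f1(prediction: str, gold: str) -> float:
    """Token-overlap F1 between a predicted and a gold answer string.

    Assumption: lowercasing + whitespace split is the only normalization.
    """
    pred_tokens = prediction.lower().split()
    gold_tokens = gold.lower().split()
    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not gold_tokens:
        return float(pred_tokens == gold_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears on both sides.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

Under this definition, a prediction that recovers two of three gold tokens with no extra tokens has precision 1 and recall 2/3, so a gain of "10 F1 points" corresponds to a 0.10 increase in the averaged score.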
Abstract
Several multi-hop reading comprehension datasets have been proposed to resolve the issue of reasoning shortcuts by which questions can be answered without performing multi-hop reasoning. However, the ability of multi-hop models to perform step-by-step reasoning when finding an answer to a comparison question remains unclear. It is also unclear how questions about the internal reasoning process are useful for training and evaluating question-answering (QA) systems. To evaluate the model precisely in a hierarchical manner, we first propose a dataset, HieraDate, with three probing tasks in addition to the main question: extraction, reasoning, and robustness. Our dataset is created by enhancing two previous multi-hop datasets, HotpotQA and 2WikiMultiHopQA, focusing on multi-hop questions on date information that involve both comparison and numerical reasoning. We then evaluate the…
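To make the hierarchy of steps concrete, here is a rough sketch of how a date-comparison question decomposes into an extraction step followed by comparison and subtraction steps. It is not the dataset's actual format or the authors' code: the function names, the dictionary layout, the ISO date strings, and the choice to measure the difference in days are all assumptions made for illustration.

```python
from datetime import date


def extract_dates(facts: dict) -> dict:
    """Extraction step: parse one ISO-format date string per entity."""
    return {entity: date.fromisoformat(text) for entity, text in facts.items()}


def earlier_entity(dates: dict, a: str, b: str) -> str:
    """Comparison step: return the entity with the earlier date (a on ties)."""
    return a if dates[a] <= dates[b] else b


def days_between(dates: dict, a: str, b: str) -> int:
    """Subtraction step: absolute difference between the two dates, in days."""
    return abs((dates[a] - dates[b]).days)


def answer_with_probes(facts: dict, a: str, b: str) -> dict:
    """Run all steps and expose each intermediate result for inspection."""
    dates = extract_dates(facts)
    return {
        "extraction": dates,
        "comparison": earlier_entity(dates, a, b),
        "subtraction_days": days_between(dates, a, b),
    }


if __name__ == "__main__":
    # Hypothetical entities and dates, not taken from the dataset.
    facts = {"Person A": "1950-03-12", "Person B": "1948-11-30"}
    print(answer_with_probes(facts, "Person A", "Person B"))
```

Exposing each intermediate result separately is what allows a check that a model which gets the final comparison right also extracts the right dates and can subtract them.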
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Natural Language Processing Techniques
