PeerQA: A Scientific Question Answering Dataset from Peer Reviews
Tim Baumgärtner, Ted Briscoe, Iryna Gurevych

TL;DR
PeerQA is a new dataset derived from peer reviews that enables research on scientific question answering, including evidence retrieval, unanswerable question detection, and answer generation for long scientific documents.
Contribution
This paper introduces PeerQA, a scientific QA dataset built from peer reviews, and provides baseline systems and analysis for three QA tasks: evidence retrieval, unanswerable question classification, and answer generation.
Findings
Decontextualization improves retrieval performance.
PeerQA is a challenging benchmark for long-document QA.
Baseline systems demonstrate the dataset's utility for developing practical QA models.
Abstract
We present PeerQA, a real-world, scientific, document-level Question Answering (QA) dataset. PeerQA questions have been sourced from peer reviews, which contain questions that reviewers raised while thoroughly examining the scientific article. Answers have been annotated by the original authors of each paper. The dataset contains 579 QA pairs from 208 academic articles, with a majority from ML and NLP, as well as a subset of other scientific communities like Geoscience and Public Health. PeerQA supports three critical tasks for developing practical QA systems: Evidence retrieval, unanswerable question classification, and answer generation. We provide a detailed analysis of the collected dataset and conduct experiments establishing baseline systems for all three tasks. Our experiments and analyses reveal the need for decontextualization in document-level retrieval, where we find that…
Taxonomy
Topics: Expert finding and Q&A systems · Topic Modeling
