Automatic Inter-document Multi-hop Scientific QA Generation
Seungmin Lee, Dongha Kim, Yuni Jeon, Junyoung Koh, Min Song

TL;DR
This paper introduces AIM-SciQA, a framework that uses LLMs, embedding-based semantic alignment, and citation information to generate large-scale multi-document, multi-hop scientific QA datasets for evaluating scientific reasoning.
Contribution
The paper presents an automated method for creating multi-hop scientific QA datasets that require cross-document reasoning and draw on citation information.
Findings
Generated 411,409 single-hop and 13,672 multi-hop QAs from 8,211 PubMed Central papers, forming the IM-SciQA dataset.
Validated high factual consistency through human and automatic checks.
Demonstrated that the dataset differentiates reasoning capabilities across the retrieval and QA stages.
Abstract
Existing automatic scientific question generation studies mainly focus on single-document factoid QA, overlooking the inter-document reasoning crucial for scientific understanding. We present AIM-SciQA, an automated framework for generating multi-document, multi-hop scientific QA datasets. AIM-SciQA extracts single-hop QAs using large language models (LLMs) with machine reading comprehension and constructs cross-document relations based on embedding-based semantic alignment while selectively leveraging citation information. Applied to 8,211 PubMed Central papers, it produced 411,409 single-hop and 13,672 multi-hop QAs, forming the IM-SciQA dataset. Human and automatic validation confirmed high factual consistency, and experimental results demonstrate that IM-SciQA effectively differentiates reasoning capabilities across retrieval and QA stages, providing a realistic and interpretable…
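The cross-document linking step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `SingleHopQA` record, the `link_across_documents` function, the similarity threshold of 0.8, the document IDs, and the three-dimensional toy vectors are all assumptions made for illustration. In practice the vectors would come from a sentence-embedding model, and the abstract does not specify how citation information is weighed, so it is treated here as an optional filter.

```python
from dataclasses import dataclass
from itertools import combinations
from math import sqrt


@dataclass(frozen=True)
class SingleHopQA:
    """Hypothetical record for one single-hop QA extracted from one paper."""
    doc_id: str
    question: str
    answer: str
    embedding: tuple  # toy vector standing in for a real sentence embedding


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def link_across_documents(qas, citations, threshold=0.8, require_citation=False):
    """Pair QAs from different documents whose embeddings are similar enough.

    `citations` is a set of (citing_doc_id, cited_doc_id) tuples. When
    `require_citation` is True, only document pairs connected by a citation
    in either direction are considered (an assumed, simplified policy).
    """
    links = []
    for a, b in combinations(qas, 2):
        if a.doc_id == b.doc_id:
            continue  # only inter-document pairs are of interest
        cited = (a.doc_id, b.doc_id) in citations or (b.doc_id, a.doc_id) in citations
        if require_citation and not cited:
            continue
        score = cosine(a.embedding, b.embedding)
        if score >= threshold:
            links.append((a, b, score))
    # Highest-similarity pairs first.
    return sorted(links, key=lambda item: item[2], reverse=True)


# Toy data: document IDs, questions, and vectors are invented for illustration.
qas = [
    SingleHopQA("DOC_A", "Which gene is mutated in disease X?", "gene G", (0.9, 0.1, 0.0)),
    SingleHopQA("DOC_B", "Which pathway does gene G regulate?", "pathway P", (0.8, 0.2, 0.1)),
    SingleHopQA("DOC_C", "How many patients were enrolled?", "120", (0.0, 0.1, 0.9)),
]
citations = {("DOC_B", "DOC_A")}

links = link_across_documents(qas, citations, require_citation=True)
for first, second, score in links:
    print(f"{first.doc_id} -> {second.doc_id}: {score:.2f}")
```

Each linked pair is a candidate for a multi-hop question in which the answer from one paper bridges to a question in another. Composing the final question and validating its factual consistency are separate steps that this sketch does not cover.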
Taxonomy
Topics: Biomedical Text Mining and Ontologies · Topic Modeling · Advanced Text Analysis Techniques
