SciQAG: A Framework for Auto-Generated Science Question Answering Dataset with Fine-grained Evaluation
Yuwei Wan, Yixuan Liu, Aswathy Ajith, Clara Grazian, Bram Hoex, Wenjie Zhang, Chunyu Kit, Tong Xie, Ian Foster

TL;DR
SciQAG is a framework that automatically generates a large, high-quality science QA dataset from scientific literature using LLMs, and introduces a benchmark to evaluate LLMs' scientific question-answering capabilities.
Contribution
The paper presents SciQAG, a novel framework for automatic science QA dataset generation, and SciQAG-24D, a new benchmark task for evaluating LLMs on scientific question answering.
Findings
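Fine-tuning LLMs on the SciQAG dataset significantly improves their performance on both open-ended question answering and scientific tasks.
The dataset contains 188,042 QA pairs extracted from 22,743 scientific papers across 24 scientific domains.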
The framework enables large-scale, research-level question generation from scientific literature.
Abstract
We introduce SciQAG, a novel framework for automatically generating high-quality science question-answer pairs from a large corpus of scientific literature based on large language models (LLMs). SciQAG consists of a QA generator and a QA evaluator, which work together to extract diverse and research-level questions and answers from scientific papers. Utilizing this framework, we construct a large-scale, high-quality, open-ended science QA dataset containing 188,042 QA pairs extracted from 22,743 scientific papers across 24 scientific domains. We also introduce SciQAG-24D, a new benchmark task designed to evaluate the science question-answering ability of LLMs. Extensive experiments demonstrate that fine-tuning LLMs on the SciQAG dataset significantly improves their performance on both open-ended question answering and scientific tasks. To foster research and collaboration, we make the…
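The generator-plus-evaluator design described in the abstract can be pictured with a minimal sketch. Everything named here is an assumption for illustration and not taken from the paper: the `QAPair` type, the `build_dataset` function, the `min_score` threshold, and the toy generator and evaluator, which stand in for the LLM-based components. The abstract does not say how the evaluator's output is used, so score-based filtering is shown only as one plausible reading.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class QAPair:
    """Hypothetical container for one generated question-answer pair."""
    question: str
    answer: str


def build_dataset(
    papers: Iterable[str],
    generate: Callable[[str], List[QAPair]],
    evaluate: Callable[[str, QAPair], float],
    min_score: float,
) -> List[QAPair]:
    """Generate QA pairs per paper and keep those the evaluator scores >= min_score.

    Filtering by a score threshold is an assumption of this sketch.
    """
    kept: List[QAPair] = []
    for paper in papers:
        for pair in generate(paper):
            if evaluate(paper, pair) >= min_score:
                kept.append(pair)
    return kept


# Toy stand-ins for the LLM-based generator and evaluator (illustrative only).
def toy_generate(paper: str) -> List[QAPair]:
    first_sentence = paper.split(".")[0].strip()
    return [
        QAPair("What does the paper report?", first_sentence),
        QAPair("What is the weather today?", ""),  # deliberately poor pair
    ]


def toy_evaluate(paper: str, pair: QAPair) -> float:
    # Score 1.0 only when the answer is non-empty and grounded in the paper text.
    return 1.0 if pair.answer and pair.answer in paper else 0.0


if __name__ == "__main__":
    corpus = ["Perovskite cells reached high efficiency. Further details follow."]
    for pair in build_dataset(corpus, toy_generate, toy_evaluate, min_score=0.5):
        print(pair.question, "->", pair.answer)
```

Passing the generator and evaluator as callables keeps the loop independent of any particular model, so either component could be swapped without touching the filtering logic.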
Taxonomy
Topics: Topic Modeling · Expert finding and Q&A systems · Natural Language Processing Techniques
