DBLP-QuAD: A Question Answering Dataset over the DBLP Scholarly Knowledge Graph
Debayan Banerjee, Sushil Awale, Ricardo Usbeck, Chris Biemann

TL;DR
This paper introduces DBLP-QuAD, the largest scholarly question answering dataset with 10,000 question-answer pairs and SPARQL queries over the DBLP knowledge graph, facilitating research in academic information retrieval.
Contribution
The creation of the largest scholarly QA dataset over the DBLP knowledge graph with annotated SPARQL queries for each question.
Findings
Contains 10,000 question-answer pairs
Includes executable SPARQL queries for each question
Enables advanced research in scholarly question answering
Abstract
In this work, we create a question answering dataset over the DBLP scholarly knowledge graph (KG). DBLP is an online reference for bibliographic information on major computer science publications that indexes over 4.4 million publications published by more than 2.2 million authors. Our dataset consists of 10,000 question-answer pairs with the corresponding SPARQL queries, which can be executed over the DBLP KG to fetch the correct answer. DBLP-QuAD is the largest scholarly question answering dataset.
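To make the pairing of a question with an executable SPARQL query concrete, here is a minimal Python sketch. It is not taken from the paper or the dataset: the record layout (`QAPair`), the helper `papers_by_author_query`, the placeholder author IRI, and the predicate names `dblp:authoredBy` and `dblp:title` under the `https://dblp.org/rdf/schema#` namespace are all assumptions made for illustration and should be checked against the actual DBLP KG and the released dataset files.

```python
from dataclasses import dataclass, field
from typing import List

# Assumed namespace for the DBLP RDF schema; verify against the live KG.
DBLP_PREFIX = "PREFIX dblp: <https://dblp.org/rdf/schema#>"


@dataclass
class QAPair:
    """Hypothetical record layout: a question, its SPARQL query, and its answers."""
    question: str
    sparql: str
    answers: List[str] = field(default_factory=list)


def papers_by_author_query(author_iri: str) -> str:
    """Build an illustrative SPARQL query listing the titles of one author's papers."""
    return (
        f"{DBLP_PREFIX}\n"
        "SELECT DISTINCT ?title WHERE {\n"
        f"  ?paper dblp:authoredBy <{author_iri}> .\n"
        "  ?paper dblp:title ?title .\n"
        "}"
    )


# Placeholder IRI, not a real DBLP person identifier.
author = "https://dblp.org/pid/00/0000"

pair = QAPair(
    question="Which papers did this author write?",
    sparql=papers_by_author_query(author),
    # answers stays empty here; it would be filled by running the query
    # against a DBLP SPARQL endpoint.
)

print(pair.sparql)
```

The sketch only builds the query string and does not contact any endpoint, so it runs offline; executing the query and collecting the answers is left to whichever SPARQL client is available.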
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Graph Neural Networks
Methods: Byte Pair Encoding · Linear Layer · Gated Linear Unit · Inverse Square Root Schedule · Adafactor · Dense Connections · Softmax · Attention Dropout · Dropout
