Knowledge Graph Question Answering Datasets and Their Generalizability: Are They Enough for Future Research?
Longquan Jiang, Ricardo Usbeck

TL;DR
This paper critically analyzes existing KGQA datasets for their ability to support generalizable question answering systems, identifies limitations, and proposes a cost-effective re-splitting method to improve dataset utility for future research.
Contribution
It introduces a novel re-splitting approach that makes existing KGQA datasets usable for evaluating generalization, without additional data collection.
Findings
Re-splitting improves evaluation of generalization in KGQA datasets.
Many existing datasets are based on outdated or discontinued knowledge graphs.
The proposed method requires no cost or manual effort, which makes it accessible to smaller research groups and companies.
Abstract
Existing approaches to Question Answering over Knowledge Graphs (KGQA) have weak generalizability. That is often due to the standard i.i.d. assumption on the underlying dataset. Recently, three levels of generalization for KGQA were defined, namely i.i.d., compositional, and zero-shot. We analyze 25 well-known KGQA datasets for 5 different Knowledge Graphs (KGs). We show that, according to this definition, many existing KGQA datasets available online are either not suited to training a generalizable KGQA system or are based on discontinued and outdated KGs. Generating new datasets is a costly process and, thus, is not an option for smaller research groups and companies. In this work, we propose a mitigation method for re-splitting available KGQA datasets so that they can be used to evaluate generalization, without any cost or manual effort. We test our hypothesis…
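The three generalization levels named in the abstract can be illustrated with a short sketch. This is not the authors' re-splitting procedure, which this page does not describe. It only shows one common reading of the levels: a test question is zero-shot if it uses a schema item (relation or class) never seen in training, compositional if all its schema items were seen but its query structure was not, and i.i.d. otherwise. The `Example` type, its fields, and the masked-template strings are hypothetical names introduced for illustration.

```python
from typing import FrozenSet, Iterable, NamedTuple, Set, Tuple


class Example(NamedTuple):
    """Hypothetical record for one KGQA question (illustrative fields only)."""
    question: str
    schema_items: FrozenSet[str]  # relations/classes used by the gold query
    template: str                 # gold query with entities masked


def collect_seen(train: Iterable[Example]) -> Tuple[Set[str], Set[str]]:
    """Gather every schema item and query template that occurs in training."""
    seen_items: Set[str] = set()
    seen_templates: Set[str] = set()
    for ex in train:
        seen_items |= ex.schema_items
        seen_templates.add(ex.template)
    return seen_items, seen_templates


def generalization_level(
    ex: Example, seen_items: Set[str], seen_templates: Set[str]
) -> str:
    """Assign a test example to one of the three levels (assumed reading)."""
    if not ex.schema_items <= seen_items:
        return "zero-shot"        # at least one unseen schema item
    if ex.template not in seen_templates:
        return "compositional"    # seen items, unseen combination
    return "i.i.d."               # items and structure both seen


# Toy data, invented for this example.
train = [
    Example("Who wrote Dune?", frozenset({"author"}), "?x author E"),
    Example("Where was Herbert born?", frozenset({"birthPlace"}), "?x birthPlace E"),
]
seen_items, seen_templates = collect_seen(train)

iid_q = Example("Who wrote Emma?", frozenset({"author"}), "?x author E")
comp_q = Example(
    "Where was the author of Dune born?",
    frozenset({"author", "birthPlace"}),
    "E author ?y . ?y birthPlace ?x",
)
zero_q = Example("Who directed Dune?", frozenset({"director"}), "?x director E")

for q in (iid_q, comp_q, zero_q):
    print(q.question, "->", generalization_level(q, seen_items, seen_templates))
```

Under this reading, a dataset whose test split contains only the first kind of question cannot measure compositional or zero-shot generalization, which is the gap a re-split is meant to close.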
Taxonomy
Topics: Topic Modeling · Advanced Graph Neural Networks · Domain Adaptation and Few-Shot Learning
