CoDEx: A Comprehensive Knowledge Graph Completion Benchmark

Tara Safavi; Danai Koutra

arXiv:2009.07810·cs.CL·October 7, 2020

TL;DR

CoDEx is a knowledge graph completion benchmark derived from Wikidata and Wikipedia. It comprises three knowledge graphs of varying size and structure, multilingual descriptions of entities and relations, and tens of thousands of hard negative triples, and it is designed to be more diverse and more difficult than existing link prediction benchmarks.

Contribution

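The paper contributes the CoDEx datasets, an analysis of the logical relation patterns in each dataset, baseline link prediction and triple classification results for five extensively tuned embedding models, and a comparison with FB15K-237 showing that CoDEx is more diverse and more difficult.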

Findings

01 CoDEx covers more diverse and interpretable content than FB15K-237.

02 CoDEx is more difficult for current embedding models.

03 Baseline link prediction and triple classification results vary across the CoDEx datasets.

Abstract

We present CoDEx, a set of knowledge graph completion datasets extracted from Wikidata and Wikipedia that improve upon existing knowledge graph completion benchmarks in scope and level of difficulty. In terms of scope, CoDEx comprises three knowledge graphs varying in size and structure, multilingual descriptions of entities and relations, and tens of thousands of hard negative triples that are plausible but verified to be false. To characterize CoDEx, we contribute thorough empirical analyses and benchmarking experiments. First, we analyze each CoDEx dataset in terms of logical relation patterns. Next, we report baseline link prediction and triple classification results on CoDEx for five extensively tuned embedding models. Finally, we differentiate CoDEx from the popular FB15K-237 knowledge graph completion dataset by showing that CoDEx covers more diverse and interpretable content,…
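
The abstract mentions analyzing each dataset in terms of logical relation patterns, without describing the procedure in this excerpt. As a rough illustration of what such an analysis can look like, the sketch below measures symmetry, one commonly studied pattern: for each relation, it computes the fraction of triples (h, r, t) whose reverse (t, r, h) is also present. The function name `symmetry_scores` and the toy triples are assumptions made for this example; they are not taken from CoDEx or from the authors' code.

```python
from collections import defaultdict


def symmetry_scores(triples):
    """Per relation, return the fraction of (head, tail) pairs whose
    reversed pair (tail, head) also appears under the same relation.

    Hypothetical helper for illustration; not the authors' procedure.
    """
    pairs_by_relation = defaultdict(set)
    for head, relation, tail in triples:
        pairs_by_relation[relation].add((head, tail))

    scores = {}
    for relation, pairs in pairs_by_relation.items():
        # Count pairs whose reverse is also observed for this relation.
        reversed_present = sum(1 for (h, t) in pairs if (t, h) in pairs)
        scores[relation] = reversed_present / len(pairs)
    return scores


# Toy triples invented for this example (not real CoDEx data).
toy_triples = [
    ("alice", "sibling_of", "bob"),
    ("bob", "sibling_of", "alice"),
    ("alice", "born_in", "paris"),
    ("bob", "born_in", "lyon"),
]

scores = symmetry_scores(toy_triples)
for relation, score in sorted(scores.items()):
    print(f"{relation}: {score:.2f}")
```

A score near 1 suggests a relation that behaves symmetrically in the data, and a score near 0 suggests it does not. Note that this simple version counts self-loops (h, r, h) as symmetric, which a more careful analysis might exclude.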

Taxonomy

Topics: Advanced Graph Neural Networks · Data Quality and Management · Topic Modeling