The DLCC Node Classification Benchmark for Analyzing Knowledge Graph Embeddings
Jan Portisch, Heiko Paulheim

TL;DR
The paper introduces the DLCC benchmark to analyze which description logic class constructors knowledge graph embeddings can learn, and shows that some constructs, such as cardinality constraints, are hard for embeddings to capture.
Contribution
It presents a new benchmark and evaluation framework for analyzing the logical expressiveness of knowledge graph embeddings, with real-world and synthetic gold standards.
Findings
Many DL constructors are learned through correlated patterns rather than true logical understanding.
Cardinality constraints are particularly difficult for most embedding approaches to learn.
Embedding approaches often do not fully capture the intended logical class constructors.
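To make the constructors named in these findings concrete, here is a minimal sketch of how class membership could be derived from a toy graph for two standard description logic constructors: an existential restriction (∃r.⊤) and a minimum cardinality constraint (≥n r.⊤). The triples, relation names, and function names are hypothetical illustrations, not the benchmark's own data or code. The resulting true/false labels are the kind of node classification target a classifier trained on embeddings would have to recover.

```python
from collections import defaultdict

# Toy triples (subject, relation, object). All names are hypothetical.
TRIPLES = [
    ("alice", "authorOf", "book1"),
    ("alice", "authorOf", "book2"),
    ("bob", "authorOf", "book3"),
    ("carol", "livesIn", "berlin"),
]


def outgoing(triples):
    """Index the set of distinct objects by (subject, relation)."""
    index = defaultdict(set)
    for s, r, o in triples:
        index[(s, r)].add(o)
    return index


def exists_relation(triples, relation):
    """Members of the class 'exists relation.Top': at least one outgoing edge."""
    idx = outgoing(triples)
    return {s for (s, r), objs in idx.items() if r == relation and objs}


def min_cardinality(triples, relation, n):
    """Members of the class '>= n relation.Top': at least n distinct objects."""
    idx = outgoing(triples)
    return {s for (s, r), objs in idx.items() if r == relation and len(objs) >= n}


def labels(triples, members):
    """Binary node classification labels for every entity in the graph."""
    entities = {s for s, _, _ in triples} | {o for _, _, o in triples}
    return {e: (e in members) for e in sorted(entities)}


exists_authors = exists_relation(TRIPLES, "authorOf")
prolific_authors = min_cardinality(TRIPLES, "authorOf", 2)
card_labels = labels(TRIPLES, prolific_authors)
```

In this toy graph, every member of the cardinality class is also a member of the existential class. A classifier can therefore do reasonably well on the cardinality labels by detecting only that an `authorOf` edge exists, which is one way a correlated pattern can stand in for the intended constructor.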
Abstract
Knowledge graph embedding is a representation learning technique that projects entities and relations in a knowledge graph to continuous vector spaces. Embeddings have gained a lot of uptake and have been heavily used in link prediction and other downstream prediction tasks. Most approaches are evaluated on a single task or a single group of tasks to determine their overall performance. The evaluation is then assessed in terms of how well the embedding approach performs on the task at hand. Still, it is hardly evaluated (and often not even deeply understood) what information the embedding approaches are actually learning to represent. To fill this gap, we present the DLCC (Description Logic Class Constructors) benchmark, a resource to analyze embedding approaches in terms of which kinds of classes they can represent. Two gold standards are presented, one based on the real-world…
Taxonomy
Topics: Advanced Graph Neural Networks · Data Quality and Management · Bayesian Modeling and Causal Inference
