Creating Knowledge Graphs Subsets using Shape Expressions
Jose Emilio Labra Gayo

TL;DR
This paper introduces formal models for different types of knowledge graphs and proposes methods, including Shape Expressions and a Pregel-based algorithm, to create domain-specific subsets for improved quality and practical use.
Contribution
It extends Shape Expressions to property and wikibase graphs and presents a novel Pregel-based validation algorithm for large-scale knowledge graph subset creation.
Findings
Extended ShEx to property and wikibase graphs.
Developed a Pregel-based validation algorithm for big data graphs.
Implemented subset creation methods on Apache Spark GraphX.
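To make the Pregel idea concrete, here is a minimal sketch in plain Python of a vertex-centric superstep loop used to pull a subset out of a small graph. It is not the paper's algorithm and it does not use Spark GraphX: the `pregel` helper, the toy edges, the seed node, and the `allowed` predicate set are all hypothetical, chosen only to illustrate how messages propagate until no vertex has anything left to send.

```python
def pregel(vertices, edges, initial_msg, vprog, send_msg, merge_msg, max_supersteps=20):
    """Toy single-machine Pregel loop (an illustration, not the GraphX API).

    vertices: dict mapping vertex id -> state
    edges:    list of (src, label, dst) tuples
    """
    # Superstep 0: every vertex processes the initial message.
    state = {v: vprog(v, s, initial_msg) for v, s in vertices.items()}
    active = set(state)
    for _ in range(max_supersteps):
        inbox = {}
        # Only edges leaving a vertex that was active last superstep may send.
        for src, label, dst in edges:
            if src not in active:
                continue
            for target, msg in send_msg(src, state[src], label, dst, state[dst]):
                inbox[target] = merge_msg(inbox[target], msg) if target in inbox else msg
        if not inbox:  # no messages in flight: the computation has converged
            break
        for v, msg in inbox.items():
            state[v] = vprog(v, state[v], msg)
        active = set(inbox)
    return state


# Hypothetical example: keep every node reachable from a seed via allowed predicates.
edges = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("carol", "likes", "dave"),
    ("eve", "knows", "alice"),
]
seeds = {"alice"}
allowed = {"knows"}
nodes = {n for s, _, d in edges for n in (s, d)}

result = pregel(
    vertices={n: n in seeds for n in nodes},  # state: is the node in the subset?
    edges=edges,
    initial_msg=False,
    vprog=lambda v, in_subset, msg: in_subset or msg,
    send_msg=lambda src, s_in, label, dst, d_in: (
        [(dst, True)] if s_in and not d_in and label in allowed else []
    ),
    merge_msg=lambda a, b: a or b,
)
subset = {v for v, in_subset in result.items() if in_subset}
```

In this sketch a vertex only sends a message when it would change its neighbour's state, which is what lets the loop terminate once the subset stops growing.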
Abstract
The adoption of knowledge graphs, first by Google and later by other big companies, has increased their popularity. In this paper we present a formal model for three different types of knowledge graphs, which we call RDF-based graphs, property graphs and wikibase graphs. In order to increase the quality of knowledge graphs, several approaches have appeared to describe and validate their contents. Shape Expressions (ShEx) has been proposed as a concise language for RDF validation. We give a brief introduction to ShEx and present two extensions that can also be used to describe and validate property graphs (PShEx) and wikibase graphs (WShEx). One problem of knowledge graphs is the large amount of data they contain, which hinders their practical application. To mitigate this problem, one approach is to create subsets of those knowledge graphs for some domains. We propose…
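The idea of describing nodes with a shape and keeping only the data that matches it can be illustrated with a small Python sketch. This is a deliberately simplified stand-in, not ShEx itself: the shape below only records how many times each predicate may occur, it ignores datatypes and shape references, and the triples, the `person_shape` dictionary, and both helper functions are hypothetical.

```python
# Hypothetical toy data: (subject, predicate, object) triples.
triples = [
    ("alice", "name", "Alice"),
    ("alice", "knows", "bob"),
    ("bob", "name", "Bob"),
    ("bob", "age", 42),
    ("carol", "knows", "alice"),
]

# Simplified shape: predicate -> (min, max) occurrences; max of None means unbounded.
# Roughly "exactly one name, any number of knows"; value constraints are not checked.
person_shape = {"name": (1, 1), "knows": (0, None)}


def conforms(node, shape, triples):
    """Check the cardinality of each shape predicate on the node's outgoing triples."""
    counts = {p: 0 for p in shape}
    for s, p, _ in triples:
        if s == node and p in counts:
            counts[p] += 1
    return all(
        lo <= counts[p] and (hi is None or counts[p] <= hi)
        for p, (lo, hi) in shape.items()
    )


def subset_for_shape(shape, triples):
    """Keep the triples of conforming subjects whose predicate the shape mentions."""
    subjects = {s for s, _, _ in triples}
    matching = {n for n in subjects if conforms(n, shape, triples)}
    return [t for t in triples if t[0] in matching and t[1] in shape]


subset = subset_for_shape(person_shape, triples)
```

Here `bob` still conforms even though he has an extra `age` triple, because the sketch treats the shape as open; that triple is simply left out of the subset. `carol` has no `name`, so none of her triples are kept.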
Taxonomy
Topics: Semantic Web and Ontologies · Advanced Graph Neural Networks · Natural Language Processing Techniques
