MapSDI: A Scaled-up Semantic Data Integration Framework for Knowledge Graph Creation
Samaneh Jozashoori, Maria-Esther Vidal

TL;DR
MapSDI is a scalable, mapping rule-based framework for semantic data integration. It reduces knowledge graph creation time by an order of magnitude without losing data, and it targets large, heterogeneous datasets.
Contribution
The paper introduces MapSDI, a novel mapping rule-based framework that improves scalability and efficiency in semantic data integration for knowledge graphs.
Findings
Knowledge graph creation time reduced by an order of magnitude.
MapSDI's transformations of sources and mapping rules are lossless.
Effective preprocessing improves RDFizer performance.
Abstract
Semantic web technologies have contributed effective solutions to the problems of data integration and knowledge graph creation. However, with the rapid growth of big data in diverse domains, several interoperability issues remain to be addressed, scalability being one of the main challenges. In this paper, we address the problem of knowledge graph creation at scale and present MapSDI, a mapping rule-based framework for optimizing semantic data integration into knowledge graphs. MapSDI enables efficient semantic enrichment of large, heterogeneous, and potentially low-quality data. The input of MapSDI is a set of data sources and mapping rules expressed in a mapping language such as RML. First, MapSDI pre-processes the sources based on semantic information extracted from the mapping rules, by applying basic database operators; it projects…
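The abstract excerpt breaks off right after it mentions projection, so the following is only a minimal sketch of what rule-driven pre-processing could look like, not MapSDI's actual implementation. It assumes a simplified, hypothetical rule format in which each template names source attributes in curly braces (loosely modelled on RML templates). The helper names (`referenced_attributes`, `project`, `deduplicate`) and the sample data are invented for illustration. The duplicate-removal step is an assumption added here because projecting away columns typically leaves repeated rows; the excerpt itself does not state it.

```python
import re


def referenced_attributes(rule_templates):
    """Collect attribute names that appear as {name} in any rule template.

    Hypothetical stand-in for extracting semantic information from mapping rules.
    """
    attrs = set()
    for template in rule_templates:
        attrs.update(re.findall(r"\{([^{}]+)\}", template))
    return attrs


def project(rows, attrs):
    """Relational projection: keep only the attributes the rules refer to."""
    ordered = sorted(attrs)
    return [{a: row[a] for a in ordered} for row in rows]


def deduplicate(rows):
    """Drop repeated rows, keeping first-seen order (assumed step, see lead-in)."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Illustrative rules and source rows; none of these values come from the paper.
rules = ["http://example.org/gene/{gene_id}", "{symbol}"]
source = [
    {"gene_id": "G1", "symbol": "TP53", "sample": "s1"},
    {"gene_id": "G1", "symbol": "TP53", "sample": "s2"},
    {"gene_id": "G2", "symbol": "BRCA1", "sample": "s1"},
]

reduced = deduplicate(project(source, referenced_attributes(rules)))
print(reduced)
```

In this toy input, the `sample` column is never referenced by a rule, so dropping it makes the first two rows identical and only two rows would reach the RDF generation step. Every distinct combination of the referenced attributes is still present, which is the sense in which such a reduction loses no data relevant to the rules.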
