TL;DR
SDM-RDFizer is a scalable, high-performance RML interpreter that efficiently converts large, heterogeneous, and duplicate-rich data sources into RDF knowledge graphs, significantly outperforming existing tools.
Contribution
It introduces novel algorithms for executing RML logical operators, enabling scalable and efficient transformation of complex data into RDF graphs.
Findings
SDM-RDFizer is two orders of magnitude faster than existing solutions.
It effectively handles heterogeneous data sources and data sources with a high duplication rate.
The tool is publicly available for use and further development.
Abstract
In recent years, the amount of data has increased exponentially, and knowledge graphs have gained attention as data structures to integrate data and knowledge harvested from myriad data sources. However, these data sources are usually characterized by data complexity issues such as large volume, high duplicate rate, and heterogeneity, so data management tools are required that can address the negative impact of these issues on the knowledge graph creation process. In this paper, we propose the SDM-RDFizer, an interpreter of the RDF Mapping Language (RML), to transform raw data in various formats into an RDF knowledge graph. SDM-RDFizer implements novel algorithms to execute the logical operators between mappings in RML, thus allowing it to scale up to complex scenarios where data is not only broad but also has a high duplication rate. We empirically evaluate the SDM-RDFizer performance against diverse…
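To make the duplication problem concrete, here is a minimal Python sketch of turning tabular rows into RDF triples while emitting each distinct triple only once. It is an illustration under stated assumptions, not SDM-RDFizer's algorithm or API: the function `rows_to_triples`, the subject template, the sample data, and the `example.org` namespace are all hypothetical, and a plain in-memory set stands in for whatever structures the tool actually uses.

```python
import csv
import io


def rows_to_triples(rows, subject_template, predicate_object_map):
    """Yield N-Triples lines for the rows, skipping triples already emitted.

    Hypothetical helper for illustration only; literal values are not
    escaped, so this is not a complete N-Triples serializer.
    """
    seen = set()  # every triple emitted so far
    for row in rows:
        subject = "<" + subject_template.format(**row) + ">"
        for predicate, column in predicate_object_map.items():
            value = row.get(column)
            if not value:
                continue  # no triple for a missing or empty value
            triple = f'{subject} <{predicate}> "{value}" .'
            if triple not in seen:
                seen.add(triple)
                yield triple


# Sample source with a duplicated record (rows 1 and 3 are identical).
SAMPLE_CSV = "id,name\n1,Alice\n2,Bob\n1,Alice\n"
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

triples = list(
    rows_to_triples(
        rows,
        "http://example.org/person/{id}",
        {"http://xmlns.com/foaf/0.1/name": "name"},
    )
)

for line in triples:
    print(line)
```

Three input rows yield two triples because the repeated record is filtered out before it reaches the output. Keeping every emitted triple in memory is the simplest possible choice and would not by itself scale to the large, duplicate-heavy sources the abstract describes.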
