Partout: A Distributed Engine for Efficient RDF Processing

Luis Galárraga; Katja Hose; Ralf Schenkel

arXiv:1212.5636 · cs.DB · December 27, 2012 · 41 cites

Open Access

TL;DR

Partout is a distributed RDF processing engine for large-scale semantic data. It fragments data sets based on a query log, allocates the fragments to nodes in a cluster, and plans queries across those nodes; the authors report that it outperforms state-of-the-art systems.

Contribution

The paper introduces a distributed engine that combines query-log-based RDF fragmentation, allocation of fragments to cluster nodes, and a query optimizer for ad-hoc SPARQL queries over large-scale Semantic Web data.

Findings

01 Outperforms state-of-the-art RDF processing systems

02 Efficient handling of updates in distributed RDF data

03 Produces optimized query execution plans for ad-hoc SPARQL queries

Abstract

The increasing interest in Semantic Web technologies has led not only to a rapid growth of semantic data on the Web but also to an increasing number of backend applications with already more than a trillion triples in some cases. Confronted with such huge amounts of data and the future growth, existing state-of-the-art systems for storing RDF and processing SPARQL queries are no longer sufficient. In this paper, we introduce Partout, a distributed engine for efficient RDF processing in a cluster of machines. We propose an effective approach for fragmenting RDF data sets based on a query log, allocating the fragments to nodes in a cluster, and finding the optimal configuration. Partout can efficiently handle updates and its query optimizer produces efficient query execution plans for ad-hoc SPARQL queries. Our experiments show the superiority of our approach to state-of-the-art…
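
The idea of fragmenting by a query log and then allocating fragments to nodes can be illustrated with a small Python sketch. This is not Partout's algorithm: the abstract does not give its details, so the matching rule (first matching triple pattern wins), the greedy size-based placement, and the names `matches`, `fragment`, and `allocate` are all assumptions made for illustration only.

```python
from collections import defaultdict

# Hypothetical representation: a triple pattern is (subject, predicate, object),
# with None standing for a variable position.
def matches(pattern, triple):
    """Return True if every constant in the pattern equals the triple's term."""
    return all(p is None or p == t for p, t in zip(pattern, triple))

def fragment(triples, log_patterns):
    """Group triples by the first query-log pattern they match (assumed rule).

    Triples matching no pattern go into a catch-all fragment keyed "rest".
    """
    fragments = defaultdict(list)
    for triple in triples:
        key = next((p for p in log_patterns if matches(p, triple)), "rest")
        fragments[key].append(triple)
    return dict(fragments)

def allocate(fragments, num_nodes):
    """Greedy placement (assumed): largest fragment first, least-loaded node."""
    nodes = [[] for _ in range(num_nodes)]
    load = [0] * num_nodes
    for key, frag in sorted(fragments.items(), key=lambda kv: -len(kv[1])):
        target = load.index(min(load))
        nodes[target].append(key)
        load[target] += len(frag)
    return nodes, load
```

Calling `fragment` with the triple patterns seen in past queries and then `allocate` with the cluster size yields a placement that balances fragment sizes. A real system would also have to weigh how often fragments are queried together, which this sketch ignores.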

Taxonomy

Topics: Semantic Web and Ontologies · Advanced Database Systems and Queries · Data Quality and Management