Luzzu - A Framework for Linked Data Quality Assessment

Jeremy Debattista; Christoph Lange; Sören Auer

arXiv:1412.3750·cs.DB·January 8, 2016

TL;DR

Luzzu is an extensible framework for assessing the quality of Linked Open Data, providing detailed metadata and problem reports without relying on SPARQL endpoints, enabling more complex quality metrics.

Contribution

The paper introduces Luzzu, an extensible framework for Linked Data quality assessment that avoids the limitations of SPARQL-based methods and supports complex metrics.

Findings

01 Assessed multiple datasets using 25 metrics across 10 dimensions.

02 Luzzu provides detailed quality metadata and problem reports.

03 Framework is extensible for third-party metric integration.
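
To make the idea of pluggable metrics concrete, here is a minimal sketch of what a plug-in contract could look like. It is not Luzzu's actual API: the names `QualityMetric`, `LabelCoverage`, and `assess` are hypothetical, and the metric itself (share of subjects carrying an `rdfs:label`) is chosen only for illustration. The sketch also assumes that triples are handed to each metric one at a time from a local source, which is one way to assess a dataset without querying a SPARQL endpoint.

```python
from typing import Dict, Iterable, List, Protocol, Set, Tuple

# A triple as plain strings: (subject, predicate, object). Assumption for brevity.
Triple = Tuple[str, str, str]


class QualityMetric(Protocol):
    """Hypothetical plug-in contract; not taken from Luzzu itself."""

    def compute(self, triple: Triple) -> None:
        """Update internal state with one triple."""
        ...

    def value(self) -> float:
        """Return the metric value for the triples seen so far."""
        ...


class LabelCoverage:
    """Illustrative metric: fraction of subjects that have an rdfs:label."""

    RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

    def __init__(self) -> None:
        self.subjects: Set[str] = set()
        self.labelled: Set[str] = set()

    def compute(self, triple: Triple) -> None:
        subject, predicate, _ = triple
        self.subjects.add(subject)
        if predicate == self.RDFS_LABEL:
            self.labelled.add(subject)

    def value(self) -> float:
        # An empty dataset yields 0.0 rather than a division error.
        return len(self.labelled) / len(self.subjects) if self.subjects else 0.0


def assess(triples: Iterable[Triple], metrics: List[QualityMetric]) -> Dict[str, float]:
    """Feed every triple to every metric in a single pass, then collect values."""
    for triple in triples:
        for metric in metrics:
            metric.compute(triple)
    return {type(metric).__name__: metric.value() for metric in metrics}
```

Under this sketch, a third party would add a metric by writing another class with `compute` and `value` methods and passing an instance to `assess`; nothing in the driver loop needs to change.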

Abstract

With the increasing adoption and growth of the Linked Open Data cloud [9], with RDFa, Microformats and other ways of embedding data into ordinary Web pages, and with initiatives such as schema.org, the Web is currently being complemented with a Web of Data. Thus, the Web of Data shares many characteristics with the original Web of Documents, which also varies in quality. This heterogeneity makes it challenging to determine the quality of the data published on the Web and to subsequently make this information explicit to data consumers. The main contribution of this article is LUZZU, a quality assessment framework for Linked Open Data. Apart from providing quality metadata and quality problem reports that can be used for data cleaning, LUZZU is extensible: third-party metrics can be easily plugged into the framework. The framework does not rely on SPARQL endpoints, and is thus free of all…
