On the Provenance of Linked Data Statistics

William Waites

arXiv:1410.5077·cs.DB·October 21, 2014

TL;DR

This paper discusses the challenges in consistently describing and interpreting statistics of linked data graphs, emphasizing the need for clear provenance to understand how such statistics are calculated.

Contribution

It surveys the problem of expressing and understanding linked data statistics and proposes a strategy to address the issue of provenance and interpretability.

Findings

01

Highlights the difficulty of uniform statistical descriptions

02

Proposes a strategy for provenance tracking in linked data statistics

03

Emphasizes the importance of understanding calculation methods

Abstract

As the amount of linked data published on the web grows, attempts are being made to describe and measure it. However, even basic statistics about a graph, such as its size, are difficult to express in a uniform and predictable way. In order to be able to sensibly interpret a statistic, it is necessary to know how it was calculated. In this paper we survey the nature of the problem and outline a strategy for addressing it.
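
To make the point about graph size concrete, here is a minimal sketch in Python. It is not taken from the paper: the toy triples, the prefixes, and the `graph_sizes` helper are all hypothetical, and the four counts are just a few plausible readings of "size". The same data yields different numbers depending on how the statistic is calculated.

```python
# Hypothetical toy graph: each statement is a (subject, predicate, object) tuple.
# The identifiers below are illustrative only and do not come from the paper.
triples = [
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:alice", "foaf:knows", "ex:bob"),   # the same statement published twice
    ("ex:bob", "foaf:name", '"Bob"'),
    ("ex:alice", "rdf:type", "foaf:Person"),
]

def graph_sizes(triples):
    """Return several different answers to 'how big is this graph?'."""
    distinct = set(triples)                      # collapse duplicate statements
    subjects = {s for s, _, _ in distinct}       # things that are described
    nodes = subjects | {o for _, _, o in distinct}  # subjects plus objects
    return {
        "statements_raw": len(triples),          # counts duplicates
        "triples_distinct": len(distinct),       # set semantics
        "distinct_subjects": len(subjects),
        "distinct_nodes": len(nodes),            # includes literals and classes
    }

sizes = graph_sizes(triples)
for name, value in sizes.items():
    print(f"{name}: {value}")
```

A bare figure such as "size = 4" cannot be interpreted without knowing which of these calculations produced it, which is the kind of provenance the abstract calls for.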

Taxonomy

Topics: Semantic Web and Ontologies · Advanced Database Systems and Queries · Biomedical Text Mining and Ontologies