On the Provenance of Linked Data Statistics
William Waites

TL;DR
This paper discusses the challenges in consistently describing and interpreting statistics of linked data graphs, emphasizing the need for clear provenance to understand how such statistics are calculated.
Contribution
It surveys the problem of expressing and understanding linked data statistics and proposes a strategy to address the issue of provenance and interpretability.
Findings
Highlights the difficulty of uniform statistical descriptions
Proposes a strategy for provenance tracking in linked data statistics
Emphasizes the importance of understanding calculation methods
Abstract
As the amount of linked data published on the web grows, attempts are being made to describe and measure it. However, even basic statistics about a graph, such as its size, are difficult to express in a uniform and predictable way. To interpret a statistic sensibly, it is necessary to know how it was calculated. In this paper we survey the nature of the problem and outline a strategy for addressing it.
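To make the abstract's point about graph size concrete, here is a minimal sketch showing how one small set of statements can yield different "size" figures depending on the counting method. The sample data, the function name `size_statistics`, and the three measures are illustrative assumptions; none of them come from the paper.

```python
# Hypothetical toy data (not from the paper): each statement is a
# (subject, predicate, object) tuple, as it might appear in a serialised file.
statements = [
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:alice", "foaf:knows", "ex:bob"),  # the same statement repeated
    ("ex:bob", "foaf:name", '"Bob"'),
    ("ex:alice", "foaf:name", '"Alice"'),
]


def size_statistics(statements):
    """Return several plausible 'size' measures, keyed by how each was computed."""
    distinct = set(statements)  # treating the graph as a set removes duplicates
    return {
        "statements_as_serialised": len(statements),
        "distinct_triples": len(distinct),
        "distinct_subjects": len({s for s, _, _ in distinct}),
    }


stats = size_statistics(statements)
for method, value in stats.items():
    print(f"{method}: {value}")
```

Each of the three numbers could reasonably be reported as the size of the graph, which is why a published figure is hard to interpret without a record of the method that produced it.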
Taxonomy
Topics: Semantic Web and Ontologies · Advanced Database Systems and Queries · Biomedical Text Mining and Ontologies
