# Data Provenance for Sport

**Authors:** Andrew J. Simmons, Scott Barnett, Simon Vajda, Rajesh Vasa

arXiv: 1812.05804 · 2018-12-17

## TL;DR

This paper introduces a custom provenance notation for sport performance analysis, maps it to W3C PROV concepts where possible, and evaluates W3C PROV and the VisTrails workflow manager, concluding that provenance notation and functionality need to be optimised for the domain of use.

## Contribution

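The paper proposes a domain-specific provenance notation for sport performance analysis, maps it to concepts in the W3C PROV standard where possible, and assesses how well W3C PROV (without specialisations) and VisTrails (without extensions) capture sport performance analysis workflows.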

## Key findings

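- As is, neither W3C PROV (without specialisations) nor VisTrails (without extensions) can fully capture sport performance analysis workflows, notably because of limitations in capturing automated and manual activities respectively
- Both notations use visual design space ineffectively, and their terminology is unlikely to match that of sport practitioners, which presents potential usability issues
- One-size-fits-all provenance and workflow systems are a poor fit in practice; notation and functionality need to be optimised for the domain of use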

## Abstract

Data analysts often discover irregularities in their underlying dataset, which need to be traced back to the original source and corrected. Standards for representing data provenance (i.e. the origins of the data), such as the W3C PROV standard, can assist with this process, but they require a mapping between abstract provenance concepts and the domain of use in order to be applied effectively. We propose a custom notation for expressing provenance of information in the sport performance analysis domain, and map our notation to concepts in the W3C PROV standard where possible. We evaluate the functionality of W3C PROV (without specialisations) and the VisTrails workflow manager (without extensions), and find that, as is, neither is able to fully capture sport performance analysis workflows, notably due to limitations surrounding capture of automated and manual activities respectively. Furthermore, their notations suffer from ineffective use of visual design space, and present potential usability issues as their terminology is unlikely to match that of sport practitioners. Our findings suggest that one-size-fits-all provenance and workflow systems are a poor fit in practice, and that their notation and functionality need to be optimised for the domain of use.
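To make the idea of tracing an irregularity back to its original source concrete, the following is a minimal sketch in plain Python. It does not reproduce the paper's notation, which is not shown in this summary. It only borrows three core W3C PROV relations (`used`, `wasGeneratedBy`, `wasAssociatedWith`); the `ProvGraph` class, the file names, the activities, and the agents are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ProvGraph:
    """Toy provenance store; class and field names are illustrative assumptions."""
    generated_by: dict = field(default_factory=dict)     # entity -> activity (PROV wasGeneratedBy)
    used: dict = field(default_factory=dict)             # activity -> input entities (PROV used)
    associated_with: dict = field(default_factory=dict)  # activity -> agent (PROV wasAssociatedWith)

    def record(self, activity, agent, inputs, outputs):
        """Record one activity, the agent responsible, and its inputs and outputs."""
        self.used[activity] = list(inputs)
        self.associated_with[activity] = agent
        for out in outputs:
            self.generated_by[out] = activity

    def trace(self, entity):
        """Walk back from an entity, returning (entity, activity, agent) steps.

        Entities that no recorded activity generated are treated as original
        sources and reported with activity and agent set to None.
        """
        steps, stack, seen = [], [entity], set()
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            activity = self.generated_by.get(current)
            if activity is None:
                steps.append((current, None, None))
                continue
            steps.append((current, activity, self.associated_with[activity]))
            stack.extend(self.used[activity])
        return steps

    def sources(self, entity):
        """Return the original sources an entity ultimately depends on, sorted by name."""
        return sorted(e for e, activity, _ in self.trace(entity) if activity is None)


# Hypothetical workflow mixing a manual activity and an automated one.
graph = ProvGraph()
graph.record("manual_video_coding", "performance_analyst",
             inputs=["match_video.mp4"], outputs=["event_codes.csv"])
graph.record("compute_player_load", "analysis_script",
             inputs=["gps_raw.csv", "event_codes.csv"], outputs=["player_load_report"])

for entity, activity, agent in graph.trace("player_load_report"):
    print(entity, "<-", activity, "by", agent)
```

In this sketch, calling `graph.sources("player_load_report")` lists the raw inputs an analyst would need to inspect when the report looks irregular. A generic model like this records manual and automated steps identically, whereas the abstract argues that practical systems struggle with one or the other and that the notation should use the practitioners' own terminology.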

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1812.05804/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/1812.05804/full.md

## References

28 references — full list in the complete paper: https://tomesphere.com/paper/1812.05804/full.md

---
Source: https://tomesphere.com/paper/1812.05804