TL;DR
This paper shows how to implement language-integrated provenance in Haskell, so that the origins of data can be tracked systematically from within the language, narrowing the gap between provenance research and practical database systems.
Contribution
The paper adapts language-integrated provenance techniques from Links to Haskell, overcoming the technical challenges this raises and providing a reusable approach to provenance in a mainstream programming language.
Findings
Successfully implemented provenance tracking in Haskell
Overcame technical challenges with Haskell's features
Provides a reusable framework for provenance in Haskell
Abstract
Scientific progress increasingly depends on data management, particularly to clean and curate data so that it can be systematically analyzed and reused. A wealth of techniques for managing and curating data (and its provenance) have been proposed, largely in the database community. In particular, a number of influential papers have proposed collecting provenance information explaining where a piece of data was copied from, or what other records were used to derive it. Most of these techniques, however, exist only as research prototypes and are not available in mainstream database systems. This means scientists must either implement such techniques themselves or (all too often) go without. This is essentially a code reuse problem: provenance techniques currently cannot be implemented reusably, only as ad hoc, usually unmaintained extensions to standard databases. An alternative,…
