A Formal Semantics for Data Analytics Pipelines
Maurizio Drocco, Claudia Misale, Guy Tremblay, Marco Aldinucci

TL;DR
This paper introduces PiCo, a formal programming model for data analytics pipelines emphasizing polymorphic operators and a data-centric semantics, enabling flexible, reusable, and easily updatable data workflows across diverse data models.
Contribution
It presents a novel formal semantics and polymorphic operator design for PiCo, a DSL for data pipelines, improving reusability and adaptability over existing frameworks.
Findings
Polymorphic operators enable reuse across data types.
Formal semantics facilitate reasoning about data transformations.
Operators can be updated without affecting the calling context, i.e., the preceding and following pipeline stages.
Abstract
In this report, we present a new programming model based on Pipelines and Operators, which are the building blocks of programs written in PiCo, a DSL for Data Analytics Pipelines. In the model we propose, we use the term Pipeline to denote a workflow that processes data collections -- rather than a computational process -- as is common in the data processing community. The novelty with respect to other frameworks is that all PiCo operators are polymorphic with respect to data types. This makes it possible to 1) reuse the same algorithms and pipelines on different data models (e.g., streams, lists, sets, etc.); 2) reuse the same operators in different contexts; and 3) update operators without affecting the calling context, i.e., the previous and following stages in the pipeline. Note that in other mainstream frameworks, such as Spark, the update of a pipeline by changing a…
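The idea of operators that are polymorphic over the data model can be illustrated with a minimal sketch. It is written in Python and is not PiCo's actual API: the names Operator, Map, Filter, Pipeline, to, replace, and run are all hypothetical, and polymorphism is approximated here by accepting any Python iterable, whereas the paper defines it through a formal semantics.

```python
from itertools import count, islice


class Operator:
    """Hypothetical pipeline stage: consumes any iterable, yields an iterable."""

    def apply(self, data):
        raise NotImplementedError


class Map(Operator):
    def __init__(self, fn):
        self.fn = fn

    def apply(self, data):
        # Lazy, so the same code handles finite collections and unbounded streams.
        return (self.fn(x) for x in data)


class Filter(Operator):
    def __init__(self, pred):
        self.pred = pred

    def apply(self, data):
        return (x for x in data if self.pred(x))


class Pipeline:
    """Hypothetical pipeline: an ordered sequence of operators."""

    def __init__(self, *stages):
        self.stages = list(stages)

    def to(self, stage):
        # Return a new pipeline with one more stage appended.
        return Pipeline(*self.stages, stage)

    def replace(self, index, stage):
        # Swap a single stage; the neighbouring stages are left untouched.
        stages = list(self.stages)
        stages[index] = stage
        return Pipeline(*stages)

    def run(self, data):
        for stage in self.stages:
            data = stage.apply(data)
        return data


# One pipeline definition, reused on three data models.
pipeline = Pipeline(Map(lambda x: x * x)).to(Filter(lambda x: x % 2 == 0))

from_list = list(pipeline.run([1, 2, 3, 4]))            # bounded, ordered
from_set = set(pipeline.run({1, 2, 3, 4}))              # bounded, unordered
from_stream = list(islice(pipeline.run(count(1)), 3))   # unbounded stream

# Updating the first operator does not require changes to the stage after it.
updated = pipeline.replace(0, Map(lambda x: x + 1))
from_updated = list(updated.run([1, 2, 3, 4]))
```

In this sketch the caller decides how to collect the result (list, set, or lazy consumption), which is one simple way to keep the operators themselves independent of the data model.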
Taxonomy
Topics: Advanced Database Systems and Queries · Scientific Computing and Data Management · Semantic Web and Ontologies
