Veer: Verifying Equivalence of Dataflow Versions in Iterative Data Analytics (Extended Version)
Sadeem Alsudais, Avinash Kumar, Chen Li

TL;DR
Veer is a tool that verifies whether different versions of a dataflow in iterative data analytics produce equivalent results, so that results from earlier runs can be reused instead of recomputed. It targets dataflows that are too large or complex for existing equivalence verifiers.
Contribution
Veer introduces a window-based approach that decomposes dataflow pairs and leverages existing equivalence verifiers to handle complex, large dataflows effectively.
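The window idea can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes each version is a linear list of operator names of equal length (real dataflows are DAGs), takes each maximal run of changed operators as a window, and delegates each window to a pluggable verifier. The names `changed_windows`, `verify_versions`, and `toy_ev` are hypothetical, and `toy_ev` is a stand-in for an existing equivalence verifier, not a real one.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# Hypothetical EV interface (an assumption of this sketch): returns True when
# it can prove two operator sequences equivalent, None when it cannot decide.
EV = Callable[[Sequence[str], Sequence[str]], Optional[bool]]


def changed_windows(p: Sequence[str], q: Sequence[str]) -> List[Tuple[int, int]]:
    """Return maximal contiguous index ranges [start, end) where p and q differ."""
    windows: List[Tuple[int, int]] = []
    start: Optional[int] = None
    for i, (a, b) in enumerate(zip(p, q)):
        if a != b:
            if start is None:
                start = i          # a new window opens at the first difference
        elif start is not None:
            windows.append((start, i))  # the window closes at the first match
            start = None
    if start is not None:
        windows.append((start, len(p)))
    return windows


def verify_versions(p: Sequence[str], q: Sequence[str], ev: EV) -> Optional[bool]:
    """Return True if every changed window is proven equivalent, else None (unknown)."""
    if len(p) != len(q):
        return None  # this sketch only handles position-aligned versions
    for start, end in changed_windows(p, q):
        if ev(p[start:end], q[start:end]) is not True:
            return None  # one unproven window leaves the whole pair undecided
    return True


def toy_ev(a: Sequence[str], b: Sequence[str]) -> Optional[bool]:
    """Toy verifier: only knows that a run of filters can be reordered."""
    if sorted(a) == sorted(b) and all(op.startswith("filter") for op in a):
        return True
    return None


v1 = ["scan", "filter(x>1)", "filter(y<5)", "aggregate"]
v2 = ["scan", "filter(y<5)", "filter(x>1)", "aggregate"]
v3 = ["scan", "filter(x>1)", "filter(y<5)", "sort"]

print(changed_windows(v1, v2))          # [(1, 3)]
print(verify_versions(v1, v2, toy_ev))  # True
print(verify_versions(v1, v3, toy_ev))  # None
```

The sketch returns "unknown" rather than "not equivalent" when a window cannot be proven, because a failed window does not by itself show that the two versions differ.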
Findings
Successfully verifies equivalence of complex dataflows
Outperforms existing equivalence verifiers (EVs) in efficiency and capability
Supports large-scale real-world dataflows
Abstract
Data analytics using GUI-based dataflows is an iterative process in which an analyst makes many iterations of changes to refine the dataflow, generating a different version at each iteration. In many cases, the result of executing a dataflow version is equivalent to the result of a previously executed version. Identifying such equivalence between the execution results of different dataflow versions is important for optimizing the performance of a dataflow by reusing results from a previous run. The size of the dataflows and the complexity of their operators often make existing equivalence verifiers (EVs) unable to solve the problem. In this paper, we present "Veer," which leverages the fact that two dataflow versions can be very similar except for a few changes. The solution divides the dataflow version pair into small parts, called windows, and verifies the equivalence within each window by…
Taxonomy
Topics: Radiation Effects in Electronics · Formal Methods in Verification · Embedded Systems Design Techniques
