ProvLight: Efficient Workflow Provenance Capture on the Edge-to-Cloud Continuum
Daniel Rosendo (ZENITH, KerData), Marta Mattoso (COPPE-UFRJ), Alexandru Costan (INSA Rennes, IRISA), Renan Souza (ORNL), Débora Pina (COPPE-UFRJ), Patrick Valduriez (ZENITH), Gabriel Antoniu (PARIS)

TL;DR
ProvLight is a lightweight tool designed to efficiently capture provenance data in hybrid IoT/Edge-Cloud workflows, significantly reducing overhead and enabling performance optimization across the continuum.
Contribution
The paper introduces ProvLight, a provenance capture tool that relies on data compression and lightweight transmission protocols and is integrated into E2Clab to support workflow analysis and optimization.
Findings
ProvLight captures provenance faster than existing systems and consumes less CPU, memory, network transmission, and energy.
Validated at large scale with 64 IoT/Edge devices on the FIT IoT-LAB testbed.
Enables efficient, low-overhead provenance capture for resource-constrained devices.
Abstract
Modern scientific workflows require hybrid infrastructures combining numerous decentralized resources on the IoT/Edge interconnected to Cloud/HPC systems (aka the Computing Continuum) to enable their optimized execution. Understanding and optimizing the performance of such complex Edge-to-Cloud workflows is challenging. Capturing the provenance of key performance indicators, with their related data and processes, may assist in understanding and optimizing workflow executions. However, the capture overhead can be prohibitive, particularly in resource-constrained devices, such as the ones on the IoT/Edge. To address this challenge, based on a performance analysis of existing systems, we propose ProvLight, a tool to enable efficient provenance capture on the IoT/Edge. We leverage simplified data models, data compression and grouping, and lightweight transmission protocols to reduce…
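The abstract names data grouping and compression as two of the levers for reducing capture overhead. The sketch below illustrates that general idea only and is not ProvLight's implementation: it assumes provenance records are small JSON-like dictionaries, and the function names (pack_records, unpack_records) and record fields are hypothetical. It groups several records into one payload and compresses it with zlib from the Python standard library, then compares the result with the cost of serializing each record on its own. The transmission protocol itself is left out.

```python
import json
import zlib


def pack_records(records):
    """Group several provenance records into one compressed payload."""
    # Compact separators drop the whitespace json.dumps adds by default.
    raw = json.dumps(records, separators=(",", ":")).encode("utf-8")
    return zlib.compress(raw, 9)


def unpack_records(payload):
    """Inverse of pack_records, as a receiving service might run it."""
    return json.loads(zlib.decompress(payload).decode("utf-8"))


# Hypothetical task records; the field names are illustrative assumptions.
records = [
    {"task": f"t{i}", "workflow": "wf1", "status": "finished", "cpu_pct": 10 + i}
    for i in range(50)
]

payload = pack_records(records)
# Total bytes if each record were serialized and sent separately, uncompressed.
individual = sum(len(json.dumps(r).encode("utf-8")) for r in records)

print("grouped + compressed bytes:", len(payload))
print("individual uncompressed bytes:", individual)
```

Grouping helps compression because records from the same workflow repeat the same keys and many of the same values, so a single compressed batch is much smaller than the sum of its parts. The trade-off is latency: records wait on the device until a batch is full.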
Taxonomy
Topics: Scientific Computing and Data Management · Cloud Computing and Resource Management · Distributed and Parallel Computing Systems
