Comparing Workflow Application Designs for High Resolution Satellite Image Analysis
Aymen Al-Saadi, Ioannis Paraskevakos, Bento Collares Gonçalves, Heather J. Lynch, Shantenu Jha, Matteo Turilli

TL;DR
This paper compares three task-parallel workflow designs for high-resolution satellite image analysis on HPC, evaluating their performance, resource utilization, and suitability for large-scale ecological monitoring tasks.
Contribution
It introduces and evaluates three novel workflow designs for processing large satellite datasets on high-performance computing systems.
Findings
Design 2 offers the best balance of efficiency and resource utilization.
Workflow performance varies significantly with dataset size and task heterogeneity.
The modeling approach accurately predicts execution times and helps select the optimal workflow design.
Abstract
Very High Resolution satellite and aerial imagery are used to monitor and conduct large scale surveys of ecological systems. Convolutional Neural Networks have successfully been employed to analyze such imagery to detect large animals and salient features. As the datasets increase in volume and number of images, utilizing High Performance Computing resources becomes necessary. In this paper, we investigate three task-parallel, data-driven workflow designs to support imagery analysis pipelines with heterogeneous tasks on HPC. We analyze the capabilities of each design when processing datasets from two use cases for a total of 4,672 satellite and aerial images, and 8.35 TB of data. We experimentally model the execution time of the tasks of the image processing pipelines. We perform experiments to characterize the resource utilization, total time to completion, and overheads of each…
