# Evaluation of pilot jobs for Apache Spark applications on HPC clusters

**Authors:** Valerie Hayot-Sasson, Tristan Glatard

arXiv: 1905.12720 · 2019-05-31

## TL;DR

This paper evaluates the effectiveness of pilot jobs versus traditional batch scheduling for running Apache Spark on HPC clusters, finding limited speed-up benefits and increased complexity with pilot jobs.

## Contribution

The paper provides an empirical comparison of pilot jobs and traditional batch scheduling for Spark on HPC clusters. It shows that pilot jobs yield little to no speed-up while adding complexity, and it advocates keeping batch scheduling as the default.

## Key findings

- The speed-up of pilot jobs over batch scheduling is moderate to nonexistent (0.98 on average), despite long queuing times.
- Pilot jobs add an extra layer of scheduling that complicates debugging and deployment.
- Traditional batch scheduling remains preferable for Spark on HPC.
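
To make the speed-up figure concrete, the sketch below shows one common way such a metric is computed. It assumes speed-up is the ratio of batch makespan to pilot-job makespan, so a value below 1 means the pilot-job run took longer. This summary does not give the paper's exact definition, so treat the formula as an assumption; the function name `speedup` and the makespan values are hypothetical and are not taken from the paper.

```python
from statistics import mean


def speedup(batch_makespan_s: float, pilot_makespan_s: float) -> float:
    """Speed-up of pilot jobs over batch scheduling (assumed definition).

    Values above 1 mean the pilot-job run finished sooner than the batch
    run; values below 1 mean it finished later.
    """
    if batch_makespan_s <= 0 or pilot_makespan_s <= 0:
        raise ValueError("makespans must be positive")
    return batch_makespan_s / pilot_makespan_s


# Hypothetical (batch, pilot) makespans in seconds, queuing time included.
# These numbers are invented for illustration only.
runs = [
    (1200.0, 1000.0),  # pilot jobs faster
    (800.0, 1000.0),   # pilot jobs slower
    (900.0, 1000.0),   # pilot jobs slower
]

speedups = [speedup(batch, pilot) for batch, pilot in runs]
average_speedup = mean(speedups)

print(f"per-run speed-ups: {speedups}")
print(f"average speed-up: {average_speedup:.2f}")
```

An average close to 1 can hide individual runs that are noticeably faster or slower, which is why the per-run values are printed alongside the mean.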

## Abstract

Big Data has become prominent throughout many scientific fields and, as a result, scientific communities have sought out Big Data frameworks to accelerate the processing of their increasingly data-intensive pipelines. However, while scientific communities typically rely on High-Performance Computing (HPC) clusters for the parallelization of their pipelines, many popular Big Data frameworks such as Hadoop and Apache Spark were primarily designed to be executed on dedicated commodity infrastructures. This paper evaluates the benefits of pilot jobs over traditional batch submission for Apache Spark on HPC clusters. Surprisingly, our results show that the speed-up provided by pilot jobs over batch scheduling is moderate to inexistent (0.98 on average) despite the presence of long queuing times. In addition, pilot jobs provide an extra layer of scheduling that complexifies debugging and deployment. We conclude that traditional batch scheduling should remain the default strategy to deploy Apache Spark applications on HPC clusters.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1905.12720/full.md

## Figures

23 figures with captions in the complete paper: https://tomesphere.com/paper/1905.12720/full.md

## References

16 references — full list in the complete paper: https://tomesphere.com/paper/1905.12720/full.md

---
Source: https://tomesphere.com/paper/1905.12720