Nf-PEAK: Process-Based Energy Attribution for Nextflow Workflows on Kubernetes Clusters
Philipp Thamm, Somayeh Mohammadi, Kathleen West, Knut Reinert, Lauritz Thamsen, Ulf Leser

TL;DR
Nf-PEAK is a containerized method that accurately attributes CPU and DRAM energy consumption to individual Nextflow workflow tasks on Kubernetes clusters, aiding energy optimization.
Contribution
Nf-PEAK combines pod-to-process identification, per-process performance counters, and a non-linear model to attribute energy at the level of individual workflow tasks.
Findings
Achieves an average mean absolute percentage error (MAPE) of 6.6% in isolated runs.
Maintains stable accuracy under co-located workloads.
Achieves lower attribution error than existing tools such as Kepler.
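For readers unfamiliar with the error metric in these findings, MAPE averages the absolute relative deviation between measured and estimated values and expresses it as a percentage. The sketch below uses the standard definition; the function name `mape` and the sample values are illustrative assumptions and are not taken from the paper.

```python
from typing import Sequence


def mape(measured: Sequence[float], estimated: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent (standard definition)."""
    if len(measured) != len(estimated):
        raise ValueError("measured and estimated must have the same length")
    if not measured:
        raise ValueError("at least one pair of values is required")
    total = 0.0
    for m, e in zip(measured, estimated):
        if m == 0:
            # The relative error is undefined when the reference value is zero.
            raise ValueError("measured values must be non-zero")
        total += abs((m - e) / m)
    return 100.0 * total / len(measured)


# Hypothetical energy values in joules, for illustration only.
example_error = mape([100.0, 200.0], [110.0, 180.0])
```

Because each term is divided by the measured value, tasks that consume very little energy can dominate the average, which is worth keeping in mind when comparing tools by this metric.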
Abstract
Scientific workflows are pipelines of interdependent tasks. They are increasingly executed on shared Kubernetes clusters via workflow engines such as Nextflow. Their energy consumption matters for both cost and sustainability. It is necessary to examine and optimize workflow tasks individually, because they can be very heterogeneous. However, estimating task-level energy on clusters is difficult: Intel RAPL counters report only node-level energy, access to counters and host process information is typically restricted, and concurrent workloads introduce resource contention and measurement noise. We present Nf-PEAK, a containerized method to attribute CPU-package and DRAM energy to individual processes and Nextflow tasks. Nf-PEAK (i) identifies workflow pods, (ii) maps pods to host processes via cgroup metadata, (iii) samples RAPL and per-process performance counters, and (iv) applies a…
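Steps (ii) and (iii) of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the commonly seen Kubernetes cgroup path layouts (cgroupfs and systemd drivers) as they appear in `/proc/<pid>/cgroup`, and it assumes RAPL is read through the Linux powercap interface, where the `energy_uj` counter wraps around at `max_energy_range_uj`. The helper names `pod_uid_from_cgroup` and `rapl_delta_uj` are hypothetical.

```python
import re
from typing import Optional

# Matches "pod<uid>" in both assumed cgroup layouts:
#   cgroupfs: /kubepods/burstable/pod<uid>/<container-id>
#   systemd:  /kubepods.slice/.../kubepods-burstable-pod<uid_with_underscores>.slice/...
_POD_UID = re.compile(
    r"pod([0-9a-fA-F]{8}(?:[-_][0-9a-fA-F]{4}){3}[-_][0-9a-fA-F]{12})"
)


def pod_uid_from_cgroup(cgroup_text: str) -> Optional[str]:
    """Return the pod UID found in the contents of /proc/<pid>/cgroup, if any."""
    for line in cgroup_text.splitlines():
        match = _POD_UID.search(line)
        if match:
            # The systemd driver writes the UID with underscores; normalize it.
            return match.group(1).replace("_", "-").lower()
    return None


def rapl_delta_uj(prev_uj: int, curr_uj: int, max_range_uj: int) -> int:
    """Energy consumed between two counter readings, tolerating one wraparound."""
    if curr_uj >= prev_uj:
        return curr_uj - prev_uj
    # The counter wrapped once between the two samples (assumed at most once).
    return curr_uj + max_range_uj - prev_uj
```

In a real sampler, the cgroup text would come from reading `/proc/<pid>/cgroup` for each host process, and the counter values from files such as `/sys/class/powercap/intel-rapl:0/energy_uj`. Both typically require elevated privileges, which matches the access restrictions the abstract mentions.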
