Carbon- and Precedence-Aware Scheduling for Data Processing Clusters
Adam Lechowicz, Rohan Shenoy, Noman Bashir, Mohammad Hajiesmaili, Adam Wierman, Christina Delimitrou

TL;DR
This paper introduces PCAPS, a scheduler that considers both carbon intensity and task precedence constraints to reduce the carbon footprint of data processing jobs without significantly affecting completion times.
Contribution
The paper presents PCAPS, a novel scheduler that integrates precedence constraints and carbon awareness, and demonstrates its effectiveness through analytical and empirical evaluations.
Findings
PCAPS reduces carbon footprint by up to 32.9%.
It balances carbon reduction and job completion time effectively.
The approach is validated on a 100-node Kubernetes cluster.
Abstract
As large-scale data processing workloads continue to grow, their carbon footprint raises concerns. Prior research on carbon-aware schedulers has focused on shifting computation to align with the availability of low-carbon energy, but these approaches assume that each task can be executed independently. In contrast, data processing jobs have precedence constraints (i.e., outputs of one task are inputs for another) that complicate decisions, since delaying an upstream "bottleneck" task to a low-carbon period will also block downstream tasks, impacting the entire job's completion time. In this paper, we show that carbon-aware scheduling for data processing benefits from knowledge of both time-varying carbon and precedence constraints. Our main contribution is PCAPS, a carbon-aware scheduler that interfaces with modern ML scheduling policies to explicitly consider the…
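The effect of precedence constraints described in the abstract can be illustrated with a small sketch. This is not the PCAPS algorithm; it is a toy model that assumes unlimited parallelism and uses a hypothetical four-task job, hypothetical durations, and a hypothetical helper `completion_time`. It only shows why delaying a bottleneck task pushes back the whole job, while delaying a task with slack may not.

```python
from graphlib import TopologicalSorter

def completion_time(durations, deps, delays=None):
    """Finish time of a DAG job, assuming every task starts as soon as
    its predecessors finish (plus an optional carbon-motivated delay).

    durations: task -> runtime
    deps:      task -> set of predecessor tasks
    delays:    task -> extra waiting time before the task starts
    """
    delays = delays or {}
    # Include every task, even those with no predecessors.
    graph = {task: deps.get(task, set()) for task in durations}
    finish = {}
    for task in TopologicalSorter(graph).static_order():
        ready = max((finish[p] for p in graph[task]), default=0)
        finish[task] = ready + delays.get(task, 0) + durations[task]
    return max(finish.values())

# Hypothetical job: A feeds B and C, and both feed D.
durations = {"A": 2, "B": 3, "C": 1, "D": 2}
deps = {"B": {"A"}, "C": {"A"}, "D": {"B", "C"}}

baseline = completion_time(durations, deps)
delay_bottleneck = completion_time(durations, deps, {"A": 4})
delay_slack_task = completion_time(durations, deps, {"C": 2})
print(baseline, delay_bottleneck, delay_slack_task)
```

In this toy job, task A is the bottleneck: every unit of delay applied to it is added to the job's completion time. Task C runs in parallel with the longer task B, so it can wait for a lower-carbon period for a short while without affecting the job, which is the kind of distinction a precedence-aware scheduler can exploit.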
Taxonomy
Topics: Distributed and Parallel Computing Systems · Cloud Computing and Resource Management · Graph Theory and Algorithms
