# Scheduling Jobs with Random Resource Requirements in Computing Clusters

**Authors:** Konstantinos Psychas, Javad Ghaderi

arXiv: 1901.05998 · 2019-01-21

## TL;DR

This paper studies how to schedule jobs in distributed computing clusters when their resource requirements fall into a very large, possibly infinite, number of types. It proposes oblivious algorithms that achieve at least 1/2 and 2/3 of the maximum throughput without knowing the traffic or resource requirement distribution.

## Contribution

The paper characterizes a fundamental limit on the maximum throughput when resource requirements belong to a very large or infinite number of types, and introduces low-complexity oblivious scheduling algorithms for that setting.

## Key findings

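- The proposed low-complexity algorithms achieve at least 1/2 and 2/3 of the maximum throughput
- These guarantees hold without knowledge of the traffic or resource requirement distribution
- Simulations on both synthetic and real traffic traces are used to verify the algorithms' performance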

## Abstract

We consider a natural scheduling problem which arises in many distributed computing frameworks. Jobs with diverse resource requirements (e.g. memory requirements) arrive over time and must be served by a cluster of servers, each with a finite resource capacity. To improve throughput and delay, the scheduler can pack as many jobs as possible in the servers subject to their capacity constraints. Motivated by the ever-increasing complexity of workloads in shared clusters, we consider a setting where the jobs' resource requirements belong to a very large number of diverse types or, in the extreme, even infinitely many types, e.g. when resource requirements are drawn from an unknown distribution over a continuous support. The application of classical scheduling approaches that crucially rely on a predefined finite set of types is discouraging in this high (or infinite) dimensional setting. We first characterize a fundamental limit on the maximum throughput in such setting, and then develop oblivious scheduling algorithms that have low complexity and can achieve at least 1/2 and 2/3 of the maximum throughput, without the knowledge of traffic or resource requirement distribution. Extensive simulation results, using both synthetic and real traffic traces, are presented to verify the performance of our algorithms.
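To make the packing constraint in the abstract concrete, here is a minimal Python sketch of the setting: jobs whose sizes come from a continuous range are placed into servers of finite capacity. The placement rule is a generic greedy best-fit heuristic chosen purely for illustration. It is not claimed to be the paper's algorithm, and the names `best_fit_place` and `pack_jobs`, the single resource dimension, and the unit capacity are all assumptions of this sketch.

```python
import random
from typing import List, Optional, Tuple

EPS = 1e-12  # tolerance for floating-point capacity comparisons


def best_fit_place(free: List[float], size: float) -> Optional[int]:
    """Return the index of the fullest server that still fits the job, or None.

    Hypothetical helper: a generic best-fit rule, not the paper's method.
    """
    best: Optional[int] = None
    for i, cap in enumerate(free):
        if size <= cap + EPS and (best is None or cap < free[best]):
            best = i
    return best


def pack_jobs(
    sizes: List[float], num_servers: int, capacity: float = 1.0
) -> Tuple[List[List[float]], List[float]]:
    """Greedily pack jobs in arrival order; jobs that do not fit stay waiting."""
    free = [capacity] * num_servers          # remaining capacity per server
    placed: List[List[float]] = [[] for _ in range(num_servers)]
    waiting: List[float] = []
    for size in sizes:
        idx = best_fit_place(free, size)
        if idx is None:
            waiting.append(size)             # no server can host this job now
        else:
            free[idx] -= size
            placed[idx].append(size)
    return placed, waiting


if __name__ == "__main__":
    # Sizes drawn from a continuous distribution, so every job is effectively
    # its own "type" (the infinite-type regime described in the abstract).
    rng = random.Random(0)
    jobs = [rng.uniform(0.05, 0.9) for _ in range(10)]
    placed, waiting = pack_jobs(jobs, num_servers=3)
    for i, server_jobs in enumerate(placed):
        print(f"server {i}: load={sum(server_jobs):.2f}, jobs={len(server_jobs)}")
    print(f"waiting jobs: {len(waiting)}")
```

Because sizes are real-valued, a scheduler cannot rely on a fixed table of job types, which is the difficulty the abstract points to. This sketch only shows the capacity constraint and one simple way to respect it; it ignores arrivals over time, departures, and the throughput guarantees the paper analyzes.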

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1901.05998/full.md

## Figures

12 figures with captions in the complete paper: https://tomesphere.com/paper/1901.05998/full.md

## References

33 references — full list in the complete paper: https://tomesphere.com/paper/1901.05998/full.md

---
Source: https://tomesphere.com/paper/1901.05998